How important is the refactoring you’re working on? Michael Feathers has a metric you should consider when deciding: plot the complexity against the churn.
Chad Fowler’s turbulence gem does just that.
The “complexity” is some arbitrary number of how tricky the code is to read; this can range from the amount of indentation to the amount of metaprogramming, and everything in between. This is computed using flog.
“Churn” simply refers to how often a file changes; a file that’s changed a bunch has high churn. Highly-churned files should be easy to modify, since people do it a ton. This is computing using git.
Using it is simple: drop this in your Gemfile:
group :development do
gem 'turbulence'
end
Then run it:
bundle exec bule
Watch out, it’ll open a new tab in the browser for you:

There’s a single method in your Cucumber suite, right now, that could silently allow regressions and ambiguity to slip in without you ever knowing. Guess which one! Actually, bonus points to you if you scroll down to the comments and make a guess before finishing this post.

Did you guess? If not, here are a few more hints.
It uses regular expressions.
It’s a large cascading conditional statement.
It’s well-intentioned, and serves as a central lookup and naming convention for what would otherwise be magic constants.
Ready for the reveal? It’s HtmlSelectorsHelpers#selector_for. Yikes! Now
that I’ve named names, let’s play everyone’s favorite game, spot-the-regex-bug:
module HtmlSelectorsHelpers
def selector_for(locator)
case locator
# snip a few lines
when /the "([^"]+)" story/
story = Story.find_by_title!($1)
"##{dom_id(story)}"
# snipped another 100+ lines
when /^the related link for the "([^"]+)" story$/
story = Story.find_by_title!($1)
"a#some_selector_for_story_#{story.id}"
end
end
Spot it? This one is fairly easy to find - a step definition like:
When I click on the related link for the "Make some bread" story
would match both regular expressions, because the first regular expression
doesn’t match against the beginning ^ or end $ of the line. Since the
case statement just evaluates from the top down, the first match wins.
How did I find that? I was adding a new piece of functionality to click on the related link, wrote that new selector, ran the test, and it failed. I wrote my passing implementation. Still failed. I tried it out in my browser, where things worked normally. It took some sleuthing to figure out that the wrong selector was being used, and where it was coming from.
Some other issues are actually lurking somewhere in the 100+ snipped lines of regexes. Let’s say you add another selector like this:
when /^the comment button for the "([^"]+)" story$/
# ...
If that is listed after the /the "([^"]+)" story/ matcher, it will never be
matched. If you’re not careful, even if you are disciplined about reaching the
red state in TDD while adding the new feature, this can lead to subtle
regressions later down the line. Consider a link or piece of content that gets
moved around on the page: if the broad regexp still matches it, you could end up
with a false positive test.
So, what’s the general issue here? When I think about the set of selectors I
implement, I believe it’s a design drawback to allow developers to add
ambiguously matching selectors. It would be better if the selector_for method
were to give you early notice of multiple colliding matches.
What might such an implementation look like? Well, if we used a construct other
than the case statement, then selector_for could evaluate all possible
matches, and raise an exception if there are multiple matches. Consider this example:
def selector_for(locator)
selectors = {
"the page" => "html > body",
/the project dropdown/ => '#project-selection ul',
/the "([^"]+)" story/ => lambda {
story = Story.find_by_title!($1)
"##{dom_id(story)}"
},
/^the related link for the "([^"]+)" story$/ => lambda {
story = Story.find_by_title!($1)
"a#some_selector_for_story_#{story.id}"
}
}
matches = selectors.map do |selector, result|
if locator =~ selector
if result.respond_to?(:call)
result.call
else
result
end
end
end.flatten
raise "Ambiguous matches!" if matches.size > 1
matches.first
end
(This is just an illustration. I’ve no idea what the performance impact is here, or if the above code works. You could smooth out the interface a bit, too.)
So - problem solved! We can identify ambiguous matches. Party hats all around.
But is this the best we can do? The root of ambiguity here is regular expressions, and you, dear astute reader, have recognized that these selectors are being used in Cucumber steps, and not just within step definition implementations. Hopefully, you’re climbing into the commenting cockpit, ready to take aim at the notion of this coupling of front-end implementation to the big-A Acceptance test suite - a notion that has gotten so much blog heat lately.

“Take that, blogger! Feel the wrath of my disagreement!”
In fact, selector_for was
removed in cucumber-rails 1.1.0.
For new apps, it’s a good change. And we should upgrade our existing apps, and
generally move away from imperative steps if we’re using them. But is there
still something to selector_for that’s worth keeping around?
Myself, I’ve liked using selector_for in the definitions of acceptance test
steps because it provides a central and well-named lookup for front-end
implementation details. But I don’t like-like it. Really, we’re trading one
set of magic constants (CSS selectors) for another (named selector strings), and
the presence of regexp-based dispatch in the land of plain old Ruby methods starts to
feel a bit out of place. Unnecessary. Even bulky:
When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
click_link(selector_for("the related discussion link for #{story_title}"))
fill_in 'Comment', with: "What fun discourse"
click_button 'Create Discussion'
end
Wouldn’t it be nice to just extract these to plain old Ruby methods?
module HtmlSelectorsHelpers
def related_discussion_link(story_title)
story = Story.find_by_title!(story_title)
"a#some_selector_for_story_#{story.id}"
end
end
World(HtmlSelectorsHelpers)
# ...
When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
click_link(related_discussion_link(story_title))
fill_in 'Comment', with: "What fun discourse"
click_button 'Create Discussion'
end
Now, we have less clunky dispatch and more cohesive step definitions which operate at a single level of abstraction.
It’s also good because this moves away from encouraging the coupling of Cucumber steps to implementation details. It encourages less string matching, and more plain Ruby method-calling.
But what about our original issue, ambiguity? It’s still possible to overwrite
a method in the HtmlSelectorsHelpers module, although I suppose it’s less
likely - the real reason I got tripped up at the beginning was that
too-general regular expression. I think we’ve done well to mitigate that source
of bugs.
The one trouble with this approach is that it’s not incremental. I can’t easily
convert all my existing selector_for code to use just methods in one fell
swoop. This is the same pain point felt when talking about removing web_steps
from an existing suite - it’s messy and time-intensive to change all that
string-matching lookup into method dispatches. But, worth it? I’d argue that
it is.
If you’re writing Cucumber, do you use selector_for, spurn it despite its
availability, or use a new version without the method? Do you place your
selectors directly in step definitions? Do you have some other lookup facility?
A somewhat lighter week of pull requests. Perhaps we were super busy; I know we were obsessesed with refactoring.
Our Heroku deployer, kumade, is being used as a series of refactoring lessons toward cleaner code. Unobtrusive Ruby uses more classes, and they are doing that like whoa here. Gabe Berke-Williams (gabebw) is leading the charge, switching to Mocha and Bourne (57a4c7f and 530c137), making a Heroku class (82c75d2), and making a Configuration class for internal use (469761b and 681bc8f). The tests saw major improvements from this, with shared contexts and better stubbing (bd93c9d, 15a65df, ea5331d, 7c20edc, 40882df, 30e0b3e, 5b66e24, 7c147ae, be5de51, and ee40dbb). He updated the README to explain that Kumade is no longer a set of Rake tasks (abb3609) then removed the rookie mistake of using !! to turn a truthy value into a proper Boolean (4279a2f).
Ever committer Marcos Tapajos (tapajos) fixed the README to say staging instead of bamboo (ee04efc) and also removed references to running the tests (56fd720), added a -v verbose flag (16e8482 and 5a4af13), got in on the making-things-into-classes fun with a Packager class (56b11b3), and fixed an infinite recursion bug on Heroku Cedar (645014c). He also bumped the version; I don’t know how Gabe feels about this (153316a).
Our Josh Clayton (joshuaclayton), who loves adding classes, cleaned up the Configuration class (fe4be8b) before adding a CLI class (2879fbc).
The one-off announcement system for Rails, paul_revere, saw README updates from Matt Jankowski (mjankowski) where he showed the pass/fail status in public (3881938) and a tiny formatting change (6b3d2cf).
Our shoulda-matchers gem, which is a collection of RSpec and Test::Unit matchers/assertions, saw some continuous integration fixes and some test fixes to make it all work. Prem Sichanugrist (sikachu) stayed atop Travis CI, making it more resilient to edge cases (8007a1a, 45242cf, a1eb080, and 272699f). The tests themselves were made more resilient, fixing the dependencies (78b5e5f), only modifying the Gemfile as needed (94aacd8), and adding a missing table (3e6463b). Also, watch your whitespace (a6fa83e)!
The client for our Copycopter product, copycopter_client, was updated for Rails 3.1. Joe Ferris (jferris) adding support for the DEBUG environment variable during the test run (556bb30), which I’m sure helped him track down the fact that it needs to send the application/json MIME type (eb8b45c), throw exceptions instead of raising them, and skip bundler for 3.1.0 (70fd5ca). He also handled more SSL errors (e72f2a6). This all ended with the release of version 1.1.1 of copycopter_client (d2b2f28 and b74ad61).
Our automatic ARel methods, pacecar, was updated by Matt Jankowski (mjankowski) to run CI using 1.9.2 (b430f50).
Our base stylesheet gem, flutie, also saw improvements from Matt Jankowski (mjankowski). The README was updated to use a code block for the examples (cf35a18) and show Travis CI results (e99ab3d and 457a7aa). The specs were updated to use an in-memory database instead of writing to disc (356f05d). Then a bug was discovered: the Rake task was not copying files into your app properly. Fixed (bc70e1b) and released as 1.3.2 (4b37b89).
Finally, a light week for paperclip, our Rails image uploader gem, and all to the README. goutham (gouthamvel) documented how to use Paperclip outside of Rails but with ActiveRecord (487ab74 and e4339af) while Gabe Berke-Williams (gabebw) fixed typos (f12e2e8).
Likewise, another light week for factory_girl, our test fixture gem. Gabe Berke-Williams (gabebw) fixed the formatting of the getting started document (67fc3bc). Omar Vargas (ovargas27) pointed out the cucumber env.rb file (c09ec8e, 6c30ae3, and 677341b). Joe Ferris (jferris) bucked the trend of modifying the getting started document to instead make Travis CI only build the master branch (861e70e).
Our headless JavaScript test driver, capybara-webkit, saw a bug fix from Matthew Mongeau (halogenandtoast) and Joe Ferris (jferris). They discovered that a HTTP POST that is then followed by a redirect re-sends the Content-Type header, leading to subtle bugs. This is now fixed (25fe9be). Joe also added more info on QT to the documentation wiki (c5e6396).
Good refactoring has a goal. The goal of this refactoring is to make it easier for people to hook into Paperclip.
To start I surveyed existing gems, plugins, and apps that hook into Paperclip. I took a quick glance through Github and found that delayed_paperclip, mongoid-paperclip, papermill, paperclip-extended, and paperclip-facecrop all served as good sample gems that change Paperclip in unique ways. My end desire was to remove the need for monkey patching, and if I could go one further to remove the need for subclassing that’d be even better.
Papermill is a sample Rails app that uses Paperclip, extending it with watermarking, copyrighting, and a custom geometry string. It does this by extending the existing thumbnail Paperclip processor, overriding two methods: the constructor and transformation_command. The new constructor does a ton of parsing: the geometry string is parsed for options indicating a desire to add a copyright or watermark, the copyright or watermark string is extracted from the options, and cropping offsets are parsed out too. It uses all of this to manipulate the transformation_command, like this: it invokes the super’s transformation_command, which produces a string. It then subs the string, sticking various bits set from the constructor into the middle of the command. The watermarking aspect is particularly tricky, since it must change the convert command to a convert ... | composite command, using string manipulation.
This will not do.
The first change I made was to parameterize the geometry parsing (3f7aee3 and eebc7d9). Now if you want a Thumbnail-based processor but with a custom geometry parser you can do that right from has_attached_file:
has_attached_file :avatar,
:styles => {:medium => {
:geometry => '80x60#10x5:2x1',
:string_geometry_parser => CroppingGeometryCommandParser.new
}}
This change looks a little like this. Before:
module Paperclip
class Thumbnail < Processor
def initialize(file, options = {}, attachment = nil)
super
geometry = options[:geometry] # this is not an option
@file = file
@target_geometry = Geometry.parse(geometry)
@current_geometry = Geometry.from_file(@file)
end
end
end
After:
module Paperclip
class Thumbnail < Processor
def initialize(file, options = {}, attachment = nil)
super
geometry = options[:geometry] # this is not an option
@file = file
@target_geometry = (options[:string_geometry_parser] || Geometry).parse(geometry)
@current_geometry = (options[:file_geometry_parser] || Geometry).from_file(@file)
end
end
end
The change? Instead of hardcoding the Geometry class it instead takes it as an argument. The geometry parsing can now be replaced by any arbitrary object. In the tests, for example, instead of creating files that parse to the proper size or instead of stubbing, you can just pass an object that has a #from_file method and produces the correct object. The Geometry class is still the default for legacy reasons.
I split these into the string_geometry_parser and file_geometry_parser because they do different tasks: one takes a file and produces an object that knows how to produce the desired scaling and cropping arguments (lets call this the transformation), and another produces an object that is passed to the transformation.
With that small but powerful change out of the way I took a look at paperclip-extended. This is an old gem; many of these extensions have been rolled into Paperclip by now. The only thing that was impossible without monkey patching was the extended interpolations. For example, normally you can have a file path like /system/images/:rails_env/:id/i:basename.:extension and the :rails_env, :id, :basename, and :extension will be replaced with proper values—but it was impossible to add new interpolation values, such as :normalized_basename (removing non-alphanumerics).
The fix for this (7478455) is similar to that for geometry parsing: pass a custom interpolator to the Attachment through has_attached_file:
has_attached_file :avatar,
:interpolator => InterpolatorWithNormalization.new
This change takes place in the Attachment class. Before:
module Paperclip
class Attachment
def interpolate(pattern, style_name = default_style)
Paperclip::Interpolations.interpolate(pattern, self, style_name)
end
end
end
After:
module Paperclip
class Attachment
def initialize(name, instance, options = {})
@interpolator = (options[:interpolator] || Paperclip::Interpolations)
end
def interpolate(pattern, style_name = default_style)
@interpolator.interpolate(pattern, self, style_name)
end
end
end
Again, a straight-forward change that makes this class easier to test and also adds more extensibility.
This is dependency injection. It is a very boring and simple principle that goes like this: abstract out class names where possible. This gains you both ease of testing and straight-forward hooks for extensibility.
You just saw two examples of it from a real project, so I’ll follow it up with an example from nothing:
Before:
class Car
def gas
puts "Vroom putter putter putter"
end
def brake
puts "screeeetch!"
end
end
After:
class Car
attr_accessor :engine
def initialize(engine)
@engine = engine
end
def gas
engine.go!
end
def brake
engine.stop!
end
end
This gains you testing ease:
describe Car do
let(:engine) do
Class.new do
def go!
@has_gone = true
end
def stop!
@has_stopped = true
end
def has_gone?
@has_gone
end
def has_stopped?
@has_stopped
end
end.new
end
subject { Car.new(engine) }
it "goes" do
car.gas
engine.should have_gone
end
it "stops" do
car.brake
engine.should have_stopped
end
end
… and flexibility:
yuppie = Person.new(:car => Car.new(HybridEngine.new(:diesel => 40, :electricity => 60))
bro = Person.new(:car => Car.new(V12Engine.new(:mpg => 2))
Now you get to use this. Things you can do:
Net::HTTP use a passed-in object; same for any credit card processing, Tweeting, file-writing, logging, and so on.It is a bad idea to do a switch based on an attribute of another object. If you must, it should be on your own data, not someone else’s. - Refactoring by Martin Fowler
—
We’re reading Refactoring as part of an internal book club. This quote reminds me of the Feature Envy code smell.
class Cart
def price
@item.price + @item.tax
end
end
If most of the work in a method is being done on another object (@item), the method should be on that other object.
If you haven’t played with Kevin Rutherford’s Reek gem before, it’s a neat way to find smells like Feature Envy in your Ruby code.