giant robots smashing into other giant robots

Written by thoughtbot

lolconomy

Watch for turbulence

How important is the refactoring you’re working on? Michael Feathers has a metric you should consider when deciding: plot the complexity against the churn.

Chad Fowler’s turbulence gem does just that.

The “complexity” is some arbitrary number of how tricky the code is to read; this can range from the amount of indentation to the amount of metaprogramming, and everything in between. This is computed using flog.

“Churn” simply refers to how often a file changes; a file that’s changed a bunch has high churn. Highly-churned files should be easy to modify, since people do it a ton. This is computing using git.

Using it is simple: drop this in your Gemfile:

group :development do
  gem 'turbulence'
end

Then run it:

bundle exec bule

Watch out, it’ll open a new tab in the browser for you:

Caveats

  • It’s just a number. It doesn’t replace human thought, passion, and desires.
  • Only works on Ruby 1.8!
  • Outliers make the plot nearly impossible to read.

jasonmorrisontb

The one-method silent killer in your Cucumber suite

There’s a single method in your Cucumber suite, right now, that could silently allow regressions and ambiguity to slip in without you ever knowing. Guess which one! Actually, bonus points to you if you scroll down to the comments and make a guess before finishing this post.

This is real Jeopardy, right?

Did you guess? If not, here are a few more hints.

It uses regular expressions.

It’s a large cascading conditional statement.

It’s well-intentioned, and serves as a central lookup and naming convention for what would otherwise be magic constants.

“I’ll take ‘deprecated methods’ for a thousand, Alex.”

Ready for the reveal? It’s HtmlSelectorsHelpers#selector_for. Yikes! Now that I’ve named names, let’s play everyone’s favorite game, spot-the-regex-bug:

module HtmlSelectorsHelpers
  def selector_for(locator)
    case locator

    # snip a few lines

    when /the "([^"]+)" story/
      story = Story.find_by_title!($1)
      "##{dom_id(story)}"

    # snipped another 100+ lines

    when /^the related link for the "([^"]+)" story$/
      story = Story.find_by_title!($1)
      "a#some_selector_for_story_#{story.id}"
  end
end

Spot it? This one is fairly easy to find - a step definition like:

When I click on the related link for the "Make some bread" story

would match both regular expressions, because the first regular expression doesn’t match against the beginning ^ or end $ of the line. Since the case statement just evaluates from the top down, the first match wins.

How did I find that? I was adding a new piece of functionality to click on the related link, wrote that new selector, ran the test, and it failed. I wrote my passing implementation. Still failed. I tried it out in my browser, where things worked normally. It took some sleuthing to figure out that the wrong selector was being used, and where it was coming from.

Digging deeper

Some other issues are actually lurking somewhere in the 100+ snipped lines of regexes. Let’s say you add another selector like this:

when /^the comment button for the "([^"]+)" story$/
  # ...

If that is listed after the /the "([^"]+)" story/ matcher, it will never be matched. If you’re not careful, even if you are disciplined about reaching the red state in TDD while adding the new feature, this can lead to subtle regressions later down the line. Consider a link or piece of content that gets moved around on the page: if the broad regexp still matches it, you could end up with a false positive test.

So, what’s the general issue here? When I think about the set of selectors I implement, I believe it’s a design drawback to allow developers to add ambiguously matching selectors. It would be better if the selector_for method were to give you early notice of multiple colliding matches.

What might such an implementation look like? Well, if we used a construct other than the case statement, then selector_for could evaluate all possible matches, and raise an exception if there are multiple matches. Consider this example:

def selector_for(locator)
  selectors = {
    "the page" => "html > body",

    /the project dropdown/ => '#project-selection ul',

    /the "([^"]+)" story/ => lambda {
      story = Story.find_by_title!($1)
      "##{dom_id(story)}"
    },

    /^the related link for the "([^"]+)" story$/ => lambda {
      story = Story.find_by_title!($1)
      "a#some_selector_for_story_#{story.id}"
    }
  }

  matches = selectors.map do |selector, result| 
    if locator =~ selector
      if result.respond_to?(:call)
        result.call
      else
        result
      end
    end
  end.flatten

  raise "Ambiguous matches!" if matches.size > 1

  matches.first
end

(This is just an illustration. I’ve no idea what the performance impact is here, or if the above code works. You could smooth out the interface a bit, too.)

So - problem solved! We can identify ambiguous matches. Party hats all around.

But wait, there’s more

But is this the best we can do? The root of ambiguity here is regular expressions, and you, dear astute reader, have recognized that these selectors are being used in Cucumber steps, and not just within step definition implementations. Hopefully, you’re climbing into the commenting cockpit, ready to take aim at the notion of this coupling of front-end implementation to the big-A Acceptance test suite - a notion that has gotten so much blog heat lately.

Rambo

“Take that, blogger! Feel the wrath of my disagreement!”

In fact, selector_for was removed in cucumber-rails 1.1.0. For new apps, it’s a good change. And we should upgrade our existing apps, and generally move away from imperative steps if we’re using them. But is there still something to selector_for that’s worth keeping around?

Myself, I’ve liked using selector_for in the definitions of acceptance test steps because it provides a central and well-named lookup for front-end implementation details. But I don’t like-like it. Really, we’re trading one set of magic constants (CSS selectors) for another (named selector strings), and the presence of regexp-based dispatch in the land of plain old Ruby methods starts to feel a bit out of place. Unnecessary. Even bulky:

When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
  click_link(selector_for("the related discussion link for #{story_title}"))
  fill_in 'Comment', with: "What fun discourse"
  click_button 'Create Discussion'
end

Wouldn’t it be nice to just extract these to plain old Ruby methods?

module HtmlSelectorsHelpers
  def related_discussion_link(story_title)
    story = Story.find_by_title!(story_title)
    "a#some_selector_for_story_#{story.id}"
  end
end

World(HtmlSelectorsHelpers)

# ...

When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
  click_link(related_discussion_link(story_title))
  fill_in 'Comment', with: "What fun discourse"
  click_button 'Create Discussion'
end

Now, we have less clunky dispatch and more cohesive step definitions which operate at a single level of abstraction.

It’s also good because this moves away from encouraging the coupling of Cucumber steps to implementation details. It encourages less string matching, and more plain Ruby method-calling.

But what about our original issue, ambiguity? It’s still possible to overwrite a method in the HtmlSelectorsHelpers module, although I suppose it’s less likely - the real reason I got tripped up at the beginning was that too-general regular expression. I think we’ve done well to mitigate that source of bugs.

Bringing it on home

The one trouble with this approach is that it’s not incremental. I can’t easily convert all my existing selector_for code to use just methods in one fell swoop. This is the same pain point felt when talking about removing web_steps from an existing suite - it’s messy and time-intensive to change all that string-matching lookup into method dispatches. But, worth it? I’d argue that it is.

If you’re writing Cucumber, do you use selector_for, spurn it despite its availability, or use a new version without the method? Do you place your selectors directly in step definitions? Do you have some other lookup facility?

lolconomy

This week in open source

A somewhat lighter week of pull requests. Perhaps we were super busy; I know we were obsessesed with refactoring.

kumade

Our Heroku deployer, kumade, is being used as a series of refactoring lessons toward cleaner code. Unobtrusive Ruby uses more classes, and they are doing that like whoa here. Gabe Berke-Williams (gabebw) is leading the charge, switching to Mocha and Bourne (57a4c7f and 530c137), making a Heroku class (82c75d2), and making a Configuration class for internal use (469761b and 681bc8f). The tests saw major improvements from this, with shared contexts and better stubbing (bd93c9d, 15a65df, ea5331d, 7c20edc, 40882df, 30e0b3e, 5b66e24, 7c147ae, be5de51, and ee40dbb). He updated the README to explain that Kumade is no longer a set of Rake tasks (abb3609) then removed the rookie mistake of using !! to turn a truthy value into a proper Boolean (4279a2f).

Ever committer Marcos Tapajos (tapajos) fixed the README to say staging instead of bamboo (ee04efc) and also removed references to running the tests (56fd720), added a -v verbose flag (16e8482 and 5a4af13), got in on the making-things-into-classes fun with a Packager class (56b11b3), and fixed an infinite recursion bug on Heroku Cedar (645014c). He also bumped the version; I don’t know how Gabe feels about this (153316a).

Our Josh Clayton (joshuaclayton), who loves adding classes, cleaned up the Configuration class (fe4be8b) before adding a CLI class (2879fbc).

paul_revere

The one-off announcement system for Rails, paul_revere, saw README updates from Matt Jankowski (mjankowski) where he showed the pass/fail status in public (3881938) and a tiny formatting change (6b3d2cf).

shoulda-matchers

Our shoulda-matchers gem, which is a collection of RSpec and Test::Unit matchers/assertions, saw some continuous integration fixes and some test fixes to make it all work. Prem Sichanugrist (sikachu) stayed atop Travis CI, making it more resilient to edge cases (8007a1a, 45242cf, a1eb080, and 272699f). The tests themselves were made more resilient, fixing the dependencies (78b5e5f), only modifying the Gemfile as needed (94aacd8), and adding a missing table (3e6463b). Also, watch your whitespace (a6fa83e)!

copycopter_client

The client for our Copycopter product, copycopter_client, was updated for Rails 3.1. Joe Ferris (jferris) adding support for the DEBUG environment variable during the test run (556bb30), which I’m sure helped him track down the fact that it needs to send the application/json MIME type (eb8b45c), throw exceptions instead of raising them, and skip bundler for 3.1.0 (70fd5ca). He also handled more SSL errors (e72f2a6). This all ended with the release of version 1.1.1 of copycopter_client (d2b2f28 and b74ad61).

pacecar

Our automatic ARel methods, pacecar, was updated by Matt Jankowski (mjankowski) to run CI using 1.9.2 (b430f50).

flutie

Our base stylesheet gem, flutie, also saw improvements from Matt Jankowski (mjankowski). The README was updated to use a code block for the examples (cf35a18) and show Travis CI results (e99ab3d and 457a7aa). The specs were updated to use an in-memory database instead of writing to disc (356f05d). Then a bug was discovered: the Rake task was not copying files into your app properly. Fixed (bc70e1b) and released as 1.3.2 (4b37b89).

paperclip

Finally, a light week for paperclip, our Rails image uploader gem, and all to the README. goutham (gouthamvel) documented how to use Paperclip outside of Rails but with ActiveRecord (487ab74 and e4339af) while Gabe Berke-Williams (gabebw) fixed typos (f12e2e8).

factory_girl

Likewise, another light week for factory_girl, our test fixture gem. Gabe Berke-Williams (gabebw) fixed the formatting of the getting started document (67fc3bc). Omar Vargas (ovargas27) pointed out the cucumber env.rb file (c09ec8e, 6c30ae3, and 677341b). Joe Ferris (jferris) bucked the trend of modifying the getting started document to instead make Travis CI only build the master branch (861e70e).

capybara-webkit

Our headless JavaScript test driver, capybara-webkit, saw a bug fix from Matthew Mongeau (halogenandtoast) and Joe Ferris (jferris). They discovered that a HTTP POST that is then followed by a redirect re-sends the Content-Type header, leading to subtle bugs. This is now fixed (25fe9be). Joe also added more info on QT to the documentation wiki (c5e6396).

lolconomy

A Paperclip Refactoring Tale: Part One: Dependency Injection

The Beginnings

Good refactoring has a goal. The goal of this refactoring is to make it easier for people to hook into Paperclip.

To start I surveyed existing gems, plugins, and apps that hook into Paperclip. I took a quick glance through Github and found that delayed_paperclip, mongoid-paperclip, papermill, paperclip-extended, and paperclip-facecrop all served as good sample gems that change Paperclip in unique ways. My end desire was to remove the need for monkey patching, and if I could go one further to remove the need for subclassing that’d be even better.

Papermill is a sample Rails app that uses Paperclip, extending it with watermarking, copyrighting, and a custom geometry string. It does this by extending the existing thumbnail Paperclip processor, overriding two methods: the constructor and transformation_command. The new constructor does a ton of parsing: the geometry string is parsed for options indicating a desire to add a copyright or watermark, the copyright or watermark string is extracted from the options, and cropping offsets are parsed out too. It uses all of this to manipulate the transformation_command, like this: it invokes the super’s transformation_command, which produces a string. It then subs the string, sticking various bits set from the constructor into the middle of the command. The watermarking aspect is particularly tricky, since it must change the convert command to a convert ... | composite command, using string manipulation.

This will not do.

The Abstractions

The first change I made was to parameterize the geometry parsing (3f7aee3 and eebc7d9). Now if you want a Thumbnail-based processor but with a custom geometry parser you can do that right from has_attached_file:

has_attached_file :avatar,
  :styles => {:medium => {
    :geometry => '80x60#10x5:2x1',
    :string_geometry_parser => CroppingGeometryCommandParser.new
  }}

This change looks a little like this. Before:

module Paperclip
  class Thumbnail < Processor
    def initialize(file, options = {}, attachment = nil)
      super

      geometry             = options[:geometry] # this is not an option
      @file                = file
      @target_geometry     = Geometry.parse(geometry)
      @current_geometry    = Geometry.from_file(@file)
    end 
  end
end

After:

module Paperclip
  class Thumbnail < Processor
    def initialize(file, options = {}, attachment = nil)
      super

      geometry             = options[:geometry] # this is not an option
      @file                = file
      @target_geometry     = (options[:string_geometry_parser] || Geometry).parse(geometry)
      @current_geometry    = (options[:file_geometry_parser] || Geometry).from_file(@file)
    end 
  end
end

The change? Instead of hardcoding the Geometry class it instead takes it as an argument. The geometry parsing can now be replaced by any arbitrary object. In the tests, for example, instead of creating files that parse to the proper size or instead of stubbing, you can just pass an object that has a #from_file method and produces the correct object. The Geometry class is still the default for legacy reasons.

I split these into the string_geometry_parser and file_geometry_parser because they do different tasks: one takes a file and produces an object that knows how to produce the desired scaling and cropping arguments (lets call this the transformation), and another produces an object that is passed to the transformation.

With that small but powerful change out of the way I took a look at paperclip-extended. This is an old gem; many of these extensions have been rolled into Paperclip by now. The only thing that was impossible without monkey patching was the extended interpolations. For example, normally you can have a file path like /system/images/:rails_env/:id/i:basename.:extension and the :rails_env, :id, :basename, and :extension will be replaced with proper values—but it was impossible to add new interpolation values, such as :normalized_basename (removing non-alphanumerics).

The fix for this (7478455) is similar to that for geometry parsing: pass a custom interpolator to the Attachment through has_attached_file:

has_attached_file :avatar,
  :interpolator => InterpolatorWithNormalization.new

This change takes place in the Attachment class. Before:

module Paperclip
  class Attachment
    def interpolate(pattern, style_name = default_style)
      Paperclip::Interpolations.interpolate(pattern, self, style_name)
    end
  end
end

After:

module Paperclip
  class Attachment
    def initialize(name, instance, options = {})
      @interpolator = (options[:interpolator] || Paperclip::Interpolations)
    end

    def interpolate(pattern, style_name = default_style)
      @interpolator.interpolate(pattern, self, style_name)
    end
  end
end

Again, a straight-forward change that makes this class easier to test and also adds more extensibility.

The Reasonings

This is dependency injection. It is a very boring and simple principle that goes like this: abstract out class names where possible. This gains you both ease of testing and straight-forward hooks for extensibility.

You just saw two examples of it from a real project, so I’ll follow it up with an example from nothing:

Before:

class Car
  def gas
    puts "Vroom putter putter putter"
  end

  def brake
    puts "screeeetch!"
  end
end

After:

class Car
  attr_accessor :engine

  def initialize(engine)
    @engine = engine
  end

  def gas
    engine.go!
  end

  def brake
    engine.stop!
  end
end

This gains you testing ease:

describe Car do
  let(:engine) do
    Class.new do

      def go!
        @has_gone = true
      end

      def stop!
        @has_stopped = true
      end

      def has_gone?
        @has_gone
      end

      def has_stopped?
        @has_stopped
      end

    end.new
  end

  subject { Car.new(engine) }

  it "goes" do
    car.gas
    engine.should have_gone
  end

  it "stops" do
    car.brake
    engine.should have_stopped
  end
end

… and flexibility:

yuppie = Person.new(:car => Car.new(HybridEngine.new(:diesel => 40, :electricity => 60))
bro = Person.new(:car => Car.new(V12Engine.new(:mpg => 2))

Y’all

Now you get to use this. Things you can do:

  • Write a Gem that provides a custom Paperclip interpolator. Encrypt file names, handle uppercase/lowercase issues, offer the time of day, offer a random number, and so on—anything you need!
  • Write a gem that provides a custom Paperclip geometry sizer. It scales and crops images differently. For example, pick out people’s faces, crop from the center, measure hard-to-measure images, and so on.
  • While refactoring your code use a dependency injection principle somewhere. Instead of Net::HTTP use a passed-in object; same for any credit card processing, Tweeting, file-writing, logging, and so on.
It is a bad idea to do a switch based on an attribute of another object. If you must, it should be on your own data, not someone else’s. - Refactoring by Martin Fowler

We’re reading Refactoring as part of an internal book club. This quote reminds me of the Feature Envy code smell.

class Cart
  def price
    @item.price + @item.tax
  end
end

If most of the work in a method is being done on another object (@item), the method should be on that other object.

If you haven’t played with Kevin Rutherford’s Reek gem before, it’s a neat way to find smells like Feature Envy in your Ruby code.