The One-Method Silent Killer in Your Cucumber Suite

Jason Morrison

There’s a single method in your Cucumber suite, right now, that could silently allow regressions and ambiguity to slip in without you ever knowing. Guess which one! Actually, bonus points to you if you scroll down to the comments and make a guess before finishing this post.

Did you guess? If not, here are a few more hints.

It uses regular expressions.

It’s a large cascading conditional statement.

It’s well-intentioned, and serves as a central lookup and naming convention for what would otherwise be magic constants.

“I’ll take ‘deprecated methods’ for a thousand, Alex.”

Ready for the reveal? It’s HtmlSelectorsHelpers#selector_for. Yikes! Now that I’ve named names, let’s play everyone’s favorite game, spot-the-regex-bug:

module HtmlSelectorsHelpers
  def selector_for(locator)
    case locator

    # snip a few lines

    when /the "([^"]+)" story/
      story = Story.find_by_title!($1)
      "##{dom_id(story)}"

    # snipped another 100+ lines

    when /^the related link for the "([^"]+)" story$/
      story = Story.find_by_title!($1)
      "a#some_selector_for_story_#{story.id}"
  end
end

Spot it? This one is fairly easy to find - a step definition like:

When I click on the related link for the "Make some bread" story

would match both regular expressions, because the first regular expression doesn’t match against the beginning ^ or end $ of the line. Since the case statement just evaluates from the top down, the first match wins.

How did I find that? I was adding a new piece of functionality to click on the related link, wrote that new selector, ran the test, and it failed. I wrote my passing implementation. Still failed. I tried it out in my browser, where things worked normally. It took some sleuthing to figure out that the wrong selector was being used, and where it was coming from.

Digging deeper

Some other issues are actually lurking somewhere in the 100+ snipped lines of regexes. Let’s say you add another selector like this:

when /^the comment button for the "([^"]+)" story$/
  # ...

If that is listed after the /the "([^"]+)" story/ matcher, it will never be matched. If you’re not careful, even if you are disciplined about reaching the red state in TDD while adding the new feature, this can lead to subtle regressions later down the line. Consider a link or piece of content that gets moved around on the page: if the broad regexp still matches it, you could end up with a false positive test.

So, what’s the general issue here? When I think about the set of selectors I implement, I believe it’s a design drawback to allow developers to add ambiguously matching selectors. It would be better if the selector_for method were to give you early notice of multiple colliding matches.

What might such an implementation look like? Well, if we used a construct other than the case statement, then selector_for could evaluate all possible matches, and raise an exception if there are multiple matches.

Consider this example:

def selector_for(locator)
  selectors = {
    "the page" => "html > body",

    /the project dropdown/ => '#project-selection ul',

    /the "([^"]+)" story/ => lambda {
      story = Story.find_by_title!($1)
      "##{dom_id(story)}"
    },

    /^the related link for the "([^"]+)" story$/ => lambda {
      story = Story.find_by_title!($1)
      "a#some_selector_for_story_#{story.id}"
    }
  }

  matches = selectors.map do |selector, result|
    if locator =~ selector
      if result.respond_to?(:call)
        result.call
      else
        result
      end
    end
  end.flatten

  raise "Ambiguous matches!" if matches.size > 1

  matches.first
end

(This is just an illustration. I’ve no idea what the performance impact is here, or if the above code works. You could smooth out the interface a bit, too.)

So - problem solved! We can identify ambiguous matches. Party hats all around.

But wait, there’s more

But is this the best we can do? The root of ambiguity here is regular expressions, and you, dear astute reader, have recognized that these selectors are being used in Cucumber steps, and not just within step definition implementations. Hopefully, you’re climbing into the commenting cockpit, ready to take aim at the notion of this coupling of front-end implementation to the big-A Acceptance test suite - a notion that has gotten so much blog heat lately.

In fact, selector_for was removed in cucumber-rails 1.1.0. For new apps, it’s a good change. And we should upgrade our existing apps, and generally move away from imperative steps if we’re using them. But is there still something to selector_for that’s worth keeping around?

Myself, I’ve liked using selector_for in the definitions of acceptance test steps because it provides a central and well-named lookup for front-end implementation details. But I don’t like-like it. Really, we’re trading one set of magic constants (CSS selectors) for another (named selector strings), and the presence of regexp-based dispatch in the land of plain old Ruby methods starts to feel a bit out of place. Unnecessary. Even bulky:

    When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
      click_link(selector_for("the related discussion link for #{story_title}"))
      fill_in 'Comment', with: "What fun discourse"
      click_button 'Create Discussion'
    end

Wouldn’t it be nice to just extract these to plain old Ruby methods?

module HtmlSelectorsHelpers
  def related_discussion_link(story_title)
    story = Story.find_by_title!(story_title)
    "a#some_selector_for_story_#{story.id}"
  end
end

World(HtmlSelectorsHelpers)

When /^I participate in a discussion related to "([^"]+)"$/ do |story_title|
  click_link(related_discussion_link(story_title))
  fill_in 'Comment', with: "What fun discourse"
  click_button 'Create Discussion'
end

Now, we have less clunky dispatch and more cohesive step definitions which operate at a single level of abstraction.

It’s also good because this moves away from encouraging the coupling of Cucumber steps to implementation details. It encourages less string matching, and more plain Ruby method-calling.

But what about our original issue, ambiguity? It’s still possible to overwrite a method in the HtmlSelectorsHelpers module, although I suppose it’s less likely - the real reason I got tripped up at the beginning was that too-general regular expression. I think we’ve done well to mitigate that source of bugs.

Bringing it on home

The one trouble with this approach is that it’s not incremental. I can’t easily convert all my existing selector_for code to use just methods in one fell swoop. This is the same pain point felt when talking about removing web_steps from an existing suite - it’s messy and time-intensive to change all that string-matching lookup into method dispatches. But, worth it? I’d argue that it is.

If you’re writing Cucumber, do you use selector_for, spurn it despite its availability, or use a new version without the method? Do you place your selectors directly in step definitions? Do you have some other lookup facility?