Writing Better Cucumber Scenarios; or, Why We're Deprecating FactoryBot's Cucumber Steps

Josh Clayton

For almost three years now, FactoryBot[1] has provided developers using Cucumber with a handful of step definitions that make it easy to create records in the database. Here are a few examples:

Given a post exists with a title of "Blog post about FactoryBot"
Given 5 posts exist
Given the following posts exist:
  | title                | author                     |
  | FactoryBot is great! | email: person1@example.com |
  | Ruby is wonderful!   | email: person2@example.com |

While these steps are very easy to use, they require the scenarios to know a great deal about each model. Your steps now know about the model name (Post), its title attribute, one of its associations (author), and the fact that an author has an email attribute… and Post is a very simple, straightforward example. Imagine a more complex domain and the amount of coupling that comes with it when using FactoryBot’s step definitions!

Mike recently wrote about writing steps from the user’s perspective, and FactoryBot’s step definitions fall into the category of knowing too much about the implementation of the application; they don’t care nearly enough about abstractions, and that’s a problem.

As of FactoryBot 3.5.0, using any of FactoryBot’s generated step definitions will print out a deprecation warning. We’ll be removing the step definitions completely in the 4.0.0 release of FactoryBot in accordance with SemVer. I imagine the existing code will be extracted to a gem similar to Cucumber Rails’ training wheels, with a nice warning urging developers not to use the steps.

Here’s an example scenario that uses the step definitions, followed by the same scenario rewritten without them to demonstrate the benefits:

# With FactoryBot’s step definitions
Scenario: Complete all incomplete todos
  Given the following todos exist:
    | title        | author                    | complete |
    | Pick up milk | email: person@example.com | false    |
    | Pick up eggs | email: person@example.com | false    |
  And I have signed in as "person@example.com"
  When I complete the todo "Pick up milk"
  And I complete the todo "Pick up eggs"
  Then I should have no incomplete todos

versus

# Without FactoryBot’s step definitions, but with added clarity
Scenario: Complete all incomplete todos
  Given I have signed in
  And I have 2 incomplete todos
  When I complete all my incomplete todos
  Then I should have no incomplete todos

There are a few differences here. First, because I’m able to title my todos, there’s a stronger tendency to write a step that refers to a todo by its title and completes it. Conceptually, though, completing each one individually and completing all incomplete todos are the same. Second, the sign-in flow is placed before the existence of two incomplete todos. As a stakeholder I wouldn’t think, “Well, in order to see the data, it should exist first!”; that’s more likely the thought process of a developer who’s familiar with the four-phase test. Third, and as I mentioned above, the second scenario now knows nothing explicit about the objects and attributes, instead dealing with concepts: “signing in” (not caring that a user signs in with an email address), “incomplete todos” (instead of caring about a complete flag on todos), “todos” (instead of the actual model, which could be named TodoItem, Todo, or Item).
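
One way to support a concept-level scenario like the second one is to move the model knowledge into a small helper module that the steps call. The sketch below is only an illustration and is not part of FactoryBot or Cucumber: `User`, `Todo`, `TodoHelpers`, and `ScenarioWorld` are hypothetical in-memory stand-ins chosen so the example runs on its own. A real suite would build records with factories and run the helpers inside Cucumber’s World.

```ruby
# Hypothetical in-memory stand-ins for the application's models.
# A real suite would create database records with factories instead.
User = Struct.new(:email, keyword_init: true)
Todo = Struct.new(:title, :author, :complete, keyword_init: true)

# Helpers that concept-level steps can delegate to. Only this module knows
# that "incomplete" is stored as a boolean flag and that users have emails.
module TodoHelpers
  def todos
    @todos ||= []
  end

  def sign_in
    @current_user = User.new(email: "person@example.com")
  end

  def create_incomplete_todos(count)
    count.times do |i|
      todos << Todo.new(title: "Todo #{i + 1}", author: @current_user, complete: false)
    end
  end

  def complete_all_incomplete_todos
    incomplete_todos.each { |todo| todo.complete = true }
  end

  def incomplete_todos
    todos.select { |todo| todo.author == @current_user && !todo.complete }
  end
end

# Stand-in for the object that step blocks would run inside.
class ScenarioWorld
  include TodoHelpers
end

world = ScenarioWorld.new
world.sign_in                                  # Given I have signed in
world.create_incomplete_todos(2)               # And I have 2 incomplete todos
remaining_before = world.incomplete_todos.size
world.complete_all_incomplete_todos            # When I complete all my incomplete todos
remaining_after = world.incomplete_todos.size  # Then I should have no incomplete todos
```

With helpers like these, each step definition could shrink to a one-line block that delegates to a helper, and renaming the model or changing how completion is stored would touch only the helper module, not the scenarios.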

Notice that the differences here are similar to the changes made to a scenario relying heavily on Cucumber Rails’ old web_steps.rb (taken from Aslak’s post on the reasons for removing Cucumber Rails’ training wheels):

Scenario: Successful login
  Given a user "Aslak" with password "xyz"
  And I am on the login page
  And I fill in "User name" with "Aslak"
  And I fill in "Password" with "xyz"
  When I press "Log in"
  Then I should see "Welcome, Aslak"

versus

Scenario: User is greeted upon login
  Given the user "Aslak" has an account
  When he logs in
  Then he should see "Welcome, Aslak"

The first scenario has lots of extra information: password, form labels, button text, current browser location. Gross. The second is much more straightforward, describing concepts instead of page details.

Actively developing a popular testing gem gives thoughtbot the distinct advantage of being able to influence the Ruby testing community, for which I’m honored. As such, it’s our duty to urge developers to write better software with the help of better tests, and FactoryBot removing these step definitions is a very real win. Cucumber is meant to describe behavior through the interface, not the intricacies of a database; by removing these automatic steps, less brittle steps will be written in their place.

Project name history can be found here.


  1. Looking for FactoryGirl? The library was renamed in 2017.