Refactor in a branch

Mike Burns

This blog post is about refactoring, git, long-lived branches, and scope creep. It has a lot to say, so here is a summary:

  • It is very acceptable for one ticket (story, todo, quickfix, …) to spawn multiple branches.
  • Refactor in a branch.

Three kinds of refactoring

There are three relevant kinds of refactorings: fixing large swaths of code to go from ugly to less ugly; the “refactor” step of “red, green, refactor”, where the thing you just wrote is improved, and; changing an existing architecture to better handle a new feature.

The first of these—improving code from ugly to elegant—is actually an example of a re-write, should probably be avoided, and definitely should live as its own entity. Working on this while working on another feature is an example of scope creep, and pretending that it is relevant to your feature is lying to yourself. This is not the blog post I want to write now, but stop doing this please.

The third step of “red, green, refactor” is an important one. The feature is not done if you skip this step. The whole point of “green” is to get to “refactor”. But it can be easy to lose sight while doing this step; it’s easy to start re-implementing irrelevant parts of the system. When going down this path it’s good to consider another option: back out the “green”—go back to the failing test—and make a change to the existing system in preparation for the implementation.

That brings us to the third kind of refactoring: re-architecting toward a goal. This is when you can use a classical best practice (as examples: composing instead of inheriting; replacing a global with an instance variable) to make your feature easier to implement. It’s the most fun kind of refactoring: you have a clear problem to solve with a secondary ideal of improving the codebase for all.

Don’t refactor with broken tests

Don’t run an experiment without a control. Don’t conflate correlation with causation. Don’t whiz on the electric fence. Don’t refactor with broken tests. Sure sure, we’ve heard it all before, but sometimes things come up!

For serious, cut it out. Instead, use this workflow.

  • Commit what you have in your feature branch, even if it’s failing. That’s fine; you’ll be back soon.
  • Make a new refactoring-specific branch, off master. This is not a branch off your branch, but instead yet another first-class branch.
  • Do your refactoring in this branch, in isolation from your feature branch. (It’s OK, even encouraged, to pull in support from your feature branch, such as RSpec matchers and Gemfile changes.)
  • Submit this small refactoring branch for code review as normal.
  • To continue work in your feature branch, rebase off the refactoring branch.

Let’s look at an example. In this example, we have a small gem that connects to a singleton database connection to fetch some data.

require 'sequel'

module Ikea
  DB = Sequel.connect(ENV['DATABASE_URL'])

  class Couch
    def all
      DB[:couches].all
    end
  end
end

The test is super chill:

require 'rspec'
require 'ikea/couch'

describe Ikea::Couch do
  it 'produces all the couches' do
    Ikea::DB[:couches].insert('BoConcept ripoff')

    Ikea::Couch.new.all.should == ['BoConcept ripoff']
  end
end

There is some setup involved, and the test is slow, but it passes and the code stays simple. Rocking. But now we need a new feature:

it 'produces the empty list if the DB is down'

We could mock the constant, but that’s annoying and problematic. We could redefine the DB constant, but that’s dumb. Alternatively, we could pass a database connection, just like the TDD gods intended. We’ll do that.

First step: commit what we have:

git checkout -b db-is-down
git commit -am wip

Then make a new branch off master:

git checkout master
git checkout -b no-singleton-db

Make the change:

module Ikea
  class Couch
    def initialize(db)
      @db = db
    end

    def all
      @db[:couches].all
    end
  end
end

require 'rspec'
require 'ikea/couch'

describe Ikea::Couch do
  it 'produces all the couches' do
    fake_table = double('couch table', all: ['BoConcept ripoff'])
    fake_db = double('db connection', :[] => fake_table)

    Ikea::Couch.new(fake_db).all.should == ['BoConcept ripoff']
  end
end

Submit for a code review:

git commit -am "Move from a singleton DB to a composed DB connection"
git push
hub pull-request

Go back to your feature, still in progress:

git checkout db-is-down
git rebase no-singleton-db

Why

This tiny refactoring branch may seem like a waste, even confusing. Multiple short-lived branches just to implement a single feature, asking your coworkers for a five-line code review, branching off master instead of your feature branch, rebasing. Rebasing!

But think of the alternatives: a single long-lived branch that sits in isolation, asking your coworkers for a long code review, branching off a branch (and perhaps branching off that!). And rebasing makes you feel like a badass, admit it.

As a bonus, these branches help keep you focused and on-scope. Make a branch for a tiny change, and you’ll be sure to make the tiny change. That’s all you have to focus on. No distracting features or broken tests, just that one refactoring.

It takes practice

This will seem slow at first. That’s OK, let it be slow. It, like TDD, will become second nature if you try.

As a useful trick, pair with someone who already does this. They’ll set you straight.

Go further

Many shell or git aliases can be built atop this. Try it out and create useful ones!