How Much Should I Refactor?

Joe Ferris

The Rails community has been abuzz with object-oriented programming, SOLID principles, laws, design patterns, and other principles, practices, and patterns. We’ve (re)discovered new tools and techniques to separate and reuse logic, making code easier to test, understand, and maintain. Now that we’ve learned about all these new tools, when do we use them?

Going too far

Applying design practices correctly takes a lot of practice, and when you first discover a hammer, everything looks like a nail. You can refactor every class ad nauseam until each one obeys every letter of the SOLID principles, but that approach will hurt you. For one thing, not every class will see enough use or change to justify the effort. For another, many of the abstractions advocated by those principles can hurt overall readability in the short term. Extracting a method means you need to jump around a file to figure out what’s going on. Extracting a class means you have to jump between files. Extracting a library means you need to jump between projects. Adding names can clarify what you’re doing, but adding too many names results in vocabulary overload. Taken to the extreme, abstractions can grant you infinite flexibility but zero readability.

Getting a sour taste

If you’ve seen these principles applied incorrectly or overapplied, you may be tempted to throw the baby out with the bathwater. Many Ruby developers came from Java, and after a few years of working with AbstractUserDecoratorFactoryFactories, some of them decided to never use the words “design pattern” again. Many principles have been applied with good intentions and terrible results, but that’s no reason to throw out everything we’ve learned since the inception of object-oriented programming.

Applying the Dependency Inversion Principle in one situation may create an unreadable, abstract puzzle, but using it when you need it can keep a single class from ruining an entire application. Applying any principle as a black-and-white law will result in a game of refactoring whack-a-mole. However, if we can’t apply these principles universally, how do we know when to be aggressive about refactoring?
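To make that trade-off concrete, here is a minimal sketch of dependency inversion in Ruby. Every name in it (SignupService, SmtpMailer, RecordingMailer) is hypothetical and invented for illustration. The high-level class depends only on "something that responds to deliver", so the concrete mailer can be swapped without touching it.

```ruby
# Hypothetical example: every class name here is invented for illustration.

# Low-level detail: stands in for a mailer that talks to a real mail server.
class SmtpMailer
  def deliver(to, body)
    # A real implementation would open a connection and send the message.
    "SMTP to #{to}: #{body}"
  end
end

# A second implementation of the same implicit interface (#deliver),
# useful in tests because it only records what it was asked to send.
class RecordingMailer
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  def deliver(to, body)
    @deliveries << [to, body]
  end
end

# High-level policy: depends on an injected object that responds to
# #deliver, not on SmtpMailer specifically.
class SignupService
  def initialize(mailer: SmtpMailer.new)
    @mailer = mailer
  end

  def sign_up(email)
    @mailer.deliver(email, "Welcome!")
  end
end
```

The cost is also visible here: a reader of SignupService can no longer tell which mailer runs without looking at the caller. That indirection is worth paying for when the dependency is volatile or hard to test, and probably not otherwise.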

God classes

Every project has them: one or two classes that seem to know everything. Any question you could ask in the domain is answered by one of these classes. Any class you look into seems to depend on them. You can’t change any class in the system without breaking the tests for a god class. My experience has taught me that most projects will have two god classes: User and whatever the focus happens to be for that application. In a blog application, it will be User and Post. In an Agile project management tool, it will be User and Story. God classes grow and become entangled with every other component of the system until there’s no room to breathe.

God classes are a great place to start aggressive refactoring. Without even looking inside their files, you can probably tell which two classes in your project are the culprits. I think very carefully before adding any behavior to these classes. I apply SOLID principles rigorously to these classes. If I can extract behavior from one of these classes using a design pattern like observer or decorator, I’ll do it.

Extracting behavior from a class like this doesn’t reduce overall comprehensibility, because the class is already too large to comprehend. Introducing abstractions to these classes makes them easier to understand, because you can at least fit all the abstractions in your head at once, even if you can’t remember what the implementations are like.
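As one illustration of this kind of extraction, the sketch below moves display logic out of a User class and into a decorator built on SimpleDelegator from Ruby's standard library. The User class is trimmed down to two attributes, and UserPresenter and its methods are hypothetical names chosen for the example.

```ruby
require "delegate"

# Hypothetical god class, reduced to two attributes for the example.
class User
  attr_reader :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
  end
end

# Decorator: wraps a User and adds display behavior without growing User.
# SimpleDelegator forwards any method it does not define to the wrapped object.
class UserPresenter < SimpleDelegator
  def full_name
    "#{first_name} #{last_name}"
  end

  def initials
    "#{first_name[0]}#{last_name[0]}"
  end
end

user = UserPresenter.new(User.new("Ada", "Lovelace"))
user.full_name  # => "Ada Lovelace"
user.first_name # => "Ada" (delegated to the wrapped User)
```

User stays the same size no matter how many display methods the views need, and the presenter can be read and tested on its own.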

Churn

You can use tools like Churn to figure out which files in your project change the most. The top two contenders are likely to be your project’s god classes, but you may be surprised which other files change the most. If something changes once, it’s likely to change again, and locating the pieces of these classes that change and extracting them will make the next change faster. Slim down the classes that change the most and make them as readable as possible; doing so increases productivity and helps avoid defects.
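If you just want a rough picture before reaching for a dedicated tool, a few lines of Ruby over your Git history will do. The sketch below is not how Churn itself works; it is a simple approximation that counts how many commits touched each file. The method name change_counts is hypothetical, and it assumes Ruby 2.7 or later for Enumerable#tally.

```ruby
# Counts how many commits touched each file, given the text produced by
#   git log --name-only --pretty=format:
# which prints only the changed file paths, with blank lines between commits.
# Returns an array of [path, count] pairs, most frequently changed first.
def change_counts(git_log_output)
  git_log_output
    .each_line
    .map(&:strip)
    .reject(&:empty?)
    .tally
    .sort_by { |path, count| [-count, path] }
end

# Usage from inside a repository (shells out to git):
#   change_counts(`git log --name-only --pretty=format:`).first(10)
```

Renamed files are counted under each name they have had, so treat the result as a starting point for investigation rather than a precise measurement.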

Bugs

If you can pinpoint a bug to a particular class, it’s likely that refactoring that class will help to prevent the next bug. Bugs love company, and the same parts of an application tend to break again and again. Extractions can help isolate the trickier components, and refactoring may reveal that the bugs are cropping up because you were thinking about the problem the wrong way to begin with. Keep track of which files you change when fixing bugs; these files are great targets for aggressive refactoring.

Crossing the finish line

When do you stop refactoring? When is a change good enough to commit? When is a branch good enough to deploy? When is a library good enough to release?

A program is never finished, and no amount of refactoring will make it perfect. I refactor to fix the parts of an application or library that have caused me pain. Rather than attacking theoretical problems, identify components that have actually bitten you and apply theory to tame them. Refactor as you go, and fix one problem at a time.