Writing a Domain-Specific Language in Ruby

Gabe Berke-Williams

A Domain-Specific Language, or DSL, is “a programming language of limited expressiveness focused on a particular domain”. A DSL makes tasks in its domain easier by stripping away extraneous code so you can focus on the task at hand. It also helps other people read the code, because its purpose is clear.

Here’s a small internal DSL in Ruby, meaning one written in Ruby itself rather than parsed from a separate language:

Smokestack.define do
  factory User do
    name "Gabe BW"
    pet_name "Toto"
  end
end

user = Smokestack.build(User)
puts user.name == 'Gabe BW'  # true
puts user.pet_name == 'Toto' # true

other_user = Smokestack.build(User, name: "Bob")
puts other_user.name == 'Bob'      # true
puts other_user.pet_name == 'Toto' # true

Let’s go over each part and how it works.

Set up

First, let’s define User and Post classes that we’ll be using in our factory:

class User
  attr_accessor :name, :pet_name
end

class Post
end

We’ll be referring to these throughout the post.

Identifying method calls

When I first looked at code like Smokestack.define above, I had a hard time identifying where methods were being called. Let’s add parentheses:

Smokestack.define do
  factory(User) do
    name("Gabe BW")
    pet_name("Toto")
  end
end

That’s better. Now we can see that Smokestack.define takes a block, and the factory method takes a class, like User, and a block. But where is the factory method coming from?

instance_eval

To find out, we need to look at instance_eval, an instance method available on every object. According to the docs:

Evaluates a string containing Ruby source code, or the given block, within the
context of the receiver (obj). In order to set the context, the variable self is
set to obj while the code is executing, giving the code access to obj’s instance
variables.

Here’s an example:

class DefinitionProxy
  def factory(factory_class)
    puts "OK, defining a #{factory_class} factory."
  end
end

definition_proxy = DefinitionProxy.new
definition_proxy.instance_eval do
  factory User
  factory Post
end

That code prints out this:

OK, defining a User factory.
OK, defining a Post factory.

The factory User and factory Post are evaluated in the context of the definition_proxy instance. That means that factory User in the definition_proxy.instance_eval block is actually calling definition_proxy.factory(User).
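To see what “the context of the receiver” means in practice, here is a minimal sketch. The Greeter class is hypothetical and not part of Smokestack. Inside the block, self is the Greeter instance, so bare method calls and instance variables resolve against it, while local variables from the surrounding code stay visible because the block is still a closure:

```ruby
# Hypothetical class, used only to illustrate instance_eval.
class Greeter
  def initialize(greeting)
    @greeting = greeting
  end

  def shout(text)
    text.upcase
  end
end

punctuation = "!" # local variable defined outside the block

RESULT = Greeter.new("hello").instance_eval do
  # self is the Greeter instance, so shout and @greeting belong to it;
  # punctuation comes from the enclosing scope.
  [self.class, shout(@greeting) + punctuation]
end

puts RESULT.inspect # [Greeter, "HELLO!"]
```

This is the same mechanism that lets factory User reach DefinitionProxy#factory without an explicit receiver.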

Now let’s add it to Smokestack:

module Smokestack
  def self.define(&block)
    definition_proxy = DefinitionProxy.new
    definition_proxy.instance_eval(&block)
  end
end

Smokestack.define is now the entry point for our DSL. It takes a block, then evaluates that block in the context of a DefinitionProxy instance.

Registering Factories

Now that our code sort of works, let’s register factories. Factories aren’t helpful unless we can refer to them, and to do that we need a central registry. The simplest registry is a hash that maps factory classes to factories.

The following code gives us Smokestack.registry:

module Smokestack
  @registry = {}

  def self.registry
    @registry
  end
end

Let’s change the factory method to register factories when they’re declared:

class DefinitionProxy
  def factory(factory_class)
    factory = lambda { puts "OK, creating a #{factory_class}." }
    Smokestack.registry[factory_class] = factory
  end
end

Instead of printing out a message immediately, we wrap it in a lambda. This means that we can store that lambda in the registry, and call factories whenever we like after registering them:

Smokestack.define do
  factory User
end

Smokestack.registry[User].call # prints "OK, creating a User."
Smokestack.registry[User].call # prints "OK, creating a User."
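The reason a lambda works here is that it defers execution and remembers the variables around it. A small sketch of that behavior, assuming a hypothetical make_factory helper and CALL_LOG array that are not part of Smokestack:

```ruby
CALL_LOG = []

# Hypothetical helper: returns a lambda that remembers klass through its closure.
def make_factory(klass)
  lambda do
    CALL_LOG << "creating a #{klass}"
    klass.new
  end
end

STRING_FACTORY = make_factory(String)
CALLS_BEFORE = CALL_LOG.size # defining the lambda runs nothing

FIRST = STRING_FACTORY.call
SECOND = STRING_FACTORY.call

puts CALLS_BEFORE         # 0
puts CALL_LOG.size        # 2
puts FIRST.equal?(SECOND) # false: each call builds a new object
```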

Diving Deeper

Now we can declare that a factory exists, but it’s not really a factory yet. It doesn’t actually initialize objects. Let’s look at the original code again:

factory User do
  name "Gabe BW"
  pet_name "Toto"
end

We want that code to declare a factory that does the following:

user = User.new
user.name = "Gabe BW"
user.pet_name = "Toto"
return user

Like Factory Bot, we’ll assume that:

  • factory User refers to the User class
  • The User class has setter methods for every attribute in the factory (e.g. name=)

Each factory might declare different attributes (e.g. pet_name), and we can’t know their names in advance, so we’ll use method_missing to handle every case.

Let’s take a stab at it:

class Factory < BasicObject
  def initialize
    @attributes = {}
  end

  attr_reader :attributes

  def method_missing(name, *args, &block)
    attributes[name] = args[0]
  end
end

class DefinitionProxy
  def factory(factory_class, &block)
    factory = Factory.new
    if block_given?
      factory.instance_eval(&block)
    end
    Smokestack.registry[factory_class] = factory
  end
end

DefinitionProxy#factory now passes its block to a Factory instance, then stores the Factory instance in the registry. If there’s no block (i.e. if block_given? evaluates to false), nothing is evaluated, but the factory is still put in the registry. Here’s a case where we have no block, because Post has no attributes:

Smokestack.define do
  factory Post
end
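The block_given? guard matters because instance_eval insists on receiving either a block or a string of code. A minimal sketch, using two hypothetical helper methods that stand in for DefinitionProxy#factory:

```ruby
# Hypothetical helper without a guard: fails when called with no block.
def evaluate_unguarded(&block)
  Object.new.instance_eval(&block)
end

# Hypothetical helper with a guard: skips evaluation when there is no block.
def evaluate_guarded(&block)
  target = Object.new
  target.instance_eval(&block) if block
  target
end

begin
  evaluate_unguarded
rescue ArgumentError => error
  puts "unguarded: #{error.class}" # unguarded: ArgumentError
end

puts "guarded: #{evaluate_guarded.class}" # guarded: Object
```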

Factory inherits from BasicObject, which is a class with very few methods defined. It’s great for use in metaprogramming like this, where you want every instance method call to trigger method_missing.
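To illustrate why BasicObject helps, here is a sketch comparing two hypothetical recorder classes (ObjectRecorder and BasicRecorder are made-up names, not part of Smokestack). A plain class inherits private Kernel methods such as format, so an attribute with that name never reaches method_missing:

```ruby
# Hypothetical recorder built on Object: inherits Kernel methods like format.
class ObjectRecorder
  attr_reader :attributes

  def initialize
    @attributes = {}
  end

  def method_missing(name, *args)
    @attributes[name] = args[0]
  end
end

# The same recorder built on BasicObject, which does not include Kernel.
class BasicRecorder < BasicObject
  attr_reader :attributes

  def initialize
    @attributes = {}
  end

  def method_missing(name, *args)
    @attributes[name] = args[0]
  end
end

OBJECT_RECORDER = ObjectRecorder.new
OBJECT_RECORDER.instance_eval { format "json" } # silently calls Kernel#format

BASIC_RECORDER = BasicRecorder.new
BASIC_RECORDER.instance_eval { format "json" } # falls through to method_missing

puts OBJECT_RECORDER.attributes.empty?       # true: nothing was recorded
puts BASIC_RECORDER.attributes.key?(:format) # true
```

The Object-based version fails silently, which is the kind of bug that is hard to spot in a DSL.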

We now have all of the Smokestack.define DSL working, with factories and a registry that maps factory classes to stored factories. Let’s add Smokestack.build.

Smokestack.build

Smokestack.build(User) needs to do the following:

  • Grab the user factory
  • Set attributes on the user, with optional overrides
  • Return the user

To get the attributes, we can grab factory.attributes from the user factory in the registry. Let’s have Smokestack.build take an optional second parameter, overrides, which allows people to pass custom values.

Here’s the code:

module Smokestack
  def self.build(factory_class, overrides = {})
    instance = factory_class.new

    # Set attributes on the new instance
    factory = registry[factory_class]
    attributes = factory.attributes.merge(overrides)
    attributes.each do |attribute_name, value|
      instance.send("#{attribute_name}=", value)
    end

    # Return the new instance
    instance
  end
end
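The override behavior in Smokestack.build comes from Hash#merge. A short sketch with stand-in hashes (DEFAULTS and OVERRIDES are illustrative names, not part of Smokestack) shows which value wins and one pitfall to be aware of:

```ruby
DEFAULTS = { name: "Gabe BW", pet_name: "Toto" }.freeze
OVERRIDES = { name: "Bob" }.freeze

# merge returns a new hash; on duplicate keys the argument's value wins.
MERGED = DEFAULTS.merge(OVERRIDES)

puts MERGED[:name]     # Bob
puts MERGED[:pet_name] # Toto
puts DEFAULTS[:name]   # Gabe BW (the stored defaults are untouched)

# Keys must match exactly: a string key does not replace a symbol key.
MISMATCHED = DEFAULTS.merge("name" => "Bob")
puts MISMATCHED.size   # 3
```

Because merge does not mutate the receiver, building one object with overrides leaves the factory’s defaults intact for the next build.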

Putting it all together

Here’s what we ended up with:

module Smokestack
  @registry = {}

  def self.registry
    @registry
  end

  def self.define(&block)
    definition_proxy = DefinitionProxy.new
    definition_proxy.instance_eval(&block)
  end

  def self.build(factory_class, overrides = {})
    instance = factory_class.new
    factory = registry[factory_class]
    attributes = factory.attributes.merge(overrides)
    attributes.each do |attribute_name, value|
      instance.send("#{attribute_name}=", value)
    end
    instance
  end
end

class DefinitionProxy
  def factory(factory_class, &block)
    factory = Factory.new
    factory.instance_eval(&block) if block_given?
    Smokestack.registry[factory_class] = factory
  end
end

class Factory < BasicObject
  def initialize
    @attributes = {}
  end

  attr_reader :attributes

  def method_missing(name, *args, &block)
    @attributes[name] = args[0]
  end
end

Let’s run it through its paces:

Smokestack.define do
  factory User do
    name "Gabe BW"
    pet_name "Toto"
  end
end

user = Smokestack.build(User)
puts user.name
#=> "Gabe BW"
puts user.pet_name
#=> "Toto"

other_user = Smokestack.build(User, name: 'Bob')
puts other_user.name
#=> "Bob"
puts other_user.pet_name
#=> "Toto"

Neat.

How could this be made better?

The use of a class directly (factory User) looks out of place if you’re familiar with Rails idioms or Factory Bot. To make this more Rails-y, we’d use a symbol like factory :user. To make this work, we need to turn the symbol into a class, which we can do with plain Ruby or ActiveSupport:

# Plain Ruby
Object.const_get(:user.capitalize) #=> User

# ActiveSupport (camelize and constantize are defined on String, not Symbol)
:user.to_s.camelize.constantize #=> User
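Symbol names with underscores need a little more than capitalize, which only upcases the first letter. A plain-Ruby sketch, assuming a hypothetical factory_class_for helper and a hypothetical BlogPost class that are not part of Smokestack:

```ruby
class User; end
class BlogPost; end

# Hypothetical helper: turns :blog_post into the BlogPost class.
def factory_class_for(name)
  class_name = name.to_s.split("_").map(&:capitalize).join
  Object.const_get(class_name)
end

puts factory_class_for(:user)      # User
puts factory_class_for(:blog_post) # BlogPost
```

Object.const_get raises NameError if no matching constant exists, so a typo in the factory name fails loudly at definition time.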

Further Reading

For more examples of DSLs, check out factory_bot or the Rails routes.


Note:

Looking for FactoryGirl? The library was renamed in 2017. Project name history can be found here.