GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS

Written by thoughtbot

Fast JSON APIs in Rails with Key-Based Caches and ActiveModel::Serializers

Want to make your Rails JSON APIs fast? Blisteringly fast? In a project I’ve been working on recently, I reduced response times from 5 seconds (or more!) to at most 0.5 seconds by using ActiveModel::Serializers and partial object caching with object composition. Before I explain how, I’ll need to explain why.

Partial Cacheability

Imagine a Rails app that stores a handful of locations in a database:

class Location < ActiveRecord::Base
  has_many :images
  has_many :categories

  geocoded_by :address

  after_validation :geocode
end

In this example, the Location model is using the geocoder gem to automatically geocode the model’s address. To find all locations within 15 miles of Boston:

Location.near('Boston', 15)

This returns an ActiveRecord::Relation with each location decorated with two pieces of additional data: distance and bearing. This data is dependent upon the point of interest searched; if I were to change my query from Boston to Cambridge (just across the Charles River), the distances and bearings would be affected even though the location data from the database would be the same.
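To make the dependence on the search point concrete, here is a minimal plain-Ruby sketch using the haversine formula. It is not how the geocoder gem is implemented; the `distance_in_miles` helper, the approximate coordinates, and the `FENWAY` sample location are all assumptions for illustration.

```ruby
# Great-circle distance in miles between two [latitude, longitude] pairs,
# using the haversine formula. Illustrative only; not geocoder's own code.
EARTH_RADIUS_MILES = 3958.8

def distance_in_miles(from, to)
  lat1, lon1 = from.map { |deg| deg * Math::PI / 180 }
  lat2, lon2 = to.map { |deg| deg * Math::PI / 180 }

  a = Math.sin((lat2 - lat1) / 2)**2 +
      Math.cos(lat1) * Math.cos(lat2) * Math.sin((lon2 - lon1) / 2)**2

  2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a))
end

# Approximate coordinates (assumed values).
BOSTON    = [42.3601, -71.0589]
CAMBRIDGE = [42.3736, -71.1097]
FENWAY    = [42.3467, -71.0972] # hypothetical stored location

# The stored record is identical in both cases; only the search point differs.
miles_from_boston    = distance_in_miles(BOSTON, FENWAY)
miles_from_cambridge = distance_in_miles(CAMBRIDGE, FENWAY)
```

The two results differ even though nothing about the stored location changed, which is why distance cannot be treated like the other attributes.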

The ActiveModel::Serializer for this model looks like this:

class LocationSerializer < ActiveModel::Serializer
  attributes :id, :name, :description
  attributes :street_one, :city, :state, :postal_code, :country, :latitude, :longitude
  attributes :distance

  has_many :images
  has_many :categories
end

ActiveModel::Serializers recently (as of 0.8.0) introduced a caching mechanism on serializers:

class MySerializer < ActiveModel::Serializer
  cached
  delegate :cache_key, to: :object
end

This allows serializers to use Rails.cache behind the scenes to cache generated JSON. However, enabling caching on the LocationSerializer doesn’t make sense because distance changes based on the search term. Finding locations within 15 miles of Boston would cache each location along with its distance from Boston; a later search from Cambridge would return those cached results, so most distances would be wrong.
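The stale-distance problem can be reproduced without Rails. The sketch below is an assumption-laden stand-in: `CACHE` is a plain Hash playing the role of Rails.cache, and `Place` and `naive_cached_hash` are hypothetical names, not part of ActiveModel::Serializers.

```ruby
# Stand-in for Rails.cache: a Hash keyed only by the record's cache key.
CACHE = {}

# Hypothetical record; distance is whatever the current search attached to it.
Place = Struct.new(:id, :name, :distance) do
  def cache_key
    "places/#{id}" # the search point is not part of the key
  end
end

# Caches the whole hash, distance included, the first time a record is seen.
def naive_cached_hash(place)
  CACHE[place.cache_key] ||= { id: place.id, name: place.name, distance: place.distance }
end

from_boston    = Place.new(1, "Fenway Park", 2.2) # as decorated by a Boston search
from_cambridge = Place.new(1, "Fenway Park", 2.0) # same record, Cambridge search

naive_cached_hash(from_boston)
stale = naive_cached_hash(from_cambridge) # cache hit: still carries the Boston distance
```

Because the key identifies the record and nothing else, the second lookup never sees the Cambridge distance.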

Serializer Composition

The solution is to cache some of the generated location JSON but not all of it (omitting distance, of course). Object composition fits well here: by writing a new serializer that combines two other serializers (one which caches the location data from the database and one which adds distance), we can rely on the cache for the heavy lifting of the JSON and compute only the distance on every search:

class SearchSerializer < ActiveModel::Serializer
  def serializable_hash
    location_serializer_hash.merge distance_serializer_hash
  end

  private

  def location_serializer_hash
    LocationSerializer.new(object, options).serializable_hash
  end

  def distance_serializer_hash
    DistanceSerializer.new(object, options).serializable_hash
  end
end

This composite serializer merges our existing serializer with a new DistanceSerializer. ActiveModel::Serializer#serializable_hash is the method which contains all the logic for #to_json, so overriding #serializable_hash on the SearchSerializer will impact #to_json correctly.
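The same composition can be sketched in plain Ruby to show what the merge buys us. Everything here is hypothetical scaffolding rather than the library's API: `STATIC_CACHE` stands in for Rails.cache, and `SearchResult`, `static_hash`, `distance_hash`, and `search_hash` are illustrative names.

```ruby
# Stand-in for Rails.cache, holding only the search-independent attributes.
STATIC_CACHE = {}

# Hypothetical record as returned by a search, decorated with a distance.
SearchResult = Struct.new(:id, :name, :updated_at, :distance) do
  def cache_key
    "locations/#{id}-#{updated_at}"
  end
end

# Cached part: attributes that come straight from the database.
def static_hash(result)
  STATIC_CACHE[result.cache_key] ||= { id: result.id, name: result.name }
end

# Uncached part: recomputed for every search.
def distance_hash(result)
  { distance: result.distance }
end

# Composite: merge returns a new Hash, so the cached entry is never mutated.
def search_hash(result)
  static_hash(result).merge(distance_hash(result))
end

boston_hit    = search_hash(SearchResult.new(1, "Fenway Park", 20130101, 2.2))
cambridge_hit = search_hash(SearchResult.new(1, "Fenway Park", 20130101, 2.0))
```

Both searches share one cached entry for the static attributes, yet each response carries its own distance.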

The modified serializers look like this:

class LocationSerializer < ActiveModel::Serializer
  cached
  delegate :cache_key, to: :object

  attributes :id, :name, :description
  attributes :street_one, :city, :state, :postal_code, :country, :latitude, :longitude
  # attributes :distance <= removed

  has_many :images
  has_many :categories
end

class DistanceSerializer < ActiveModel::Serializer
  attributes :distance
end

To use a custom serializer when operating on an array, ActiveModel::ArraySerializer provides an each_serializer option:

ActiveModel::ArraySerializer.new(@locations, each_serializer: SearchSerializer)

Finally, we’ll want to cache all the other has_many results from the LocationSerializer by creating serializers for each of the models:

class ImageSerializer < ActiveModel::Serializer
  cached
  delegate :cache_key, to: :object

  attributes :image

  def image
    object.url
  end
end

class CategorySerializer < ActiveModel::Serializer
  cached
  delegate :cache_key, to: :object

  attributes :display_in_search_results, :display_in_details, :name, :class_name

  def class_name
    object.name.parameterize
  end
end

By enabling caching on the other related pieces of data, we’re effectively enabling key-based cache expiration, thanks to the way ActiveModel::Serializers builds its cache keys from each object’s cache_key. For static data, this is incredibly effective because there will rarely be a cache miss after the cache is warmed.
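Key-based expiration means nothing is ever explicitly invalidated: when a record changes, its key changes, and the old entry is simply never read again. A minimal sketch of that idea follows, assuming a key shaped roughly like ActiveRecord’s (model name, id, and updated_at timestamp); `Record` and `CountingCache` are hypothetical helpers for illustration.

```ruby
# Hypothetical record whose cache key embeds its last-modified timestamp.
Record = Struct.new(:id, :name, :updated_at) do
  def cache_key
    "categories/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
  end
end

# Tiny in-memory cache that counts how often the block had to run.
class CountingCache
  attr_reader :misses

  def initialize
    @store = {}
    @misses = 0
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    @misses += 1
    @store[key] = yield
  end
end

cache     = CountingCache.new
serialize = ->(record) { cache.fetch(record.cache_key) { { name: record.name } } }

category = Record.new(1, "Museums", Time.utc(2013, 1, 1))
serialize.call(category) # miss: first time this key is seen
serialize.call(category) # hit: same key

# Saving a record bumps updated_at, which produces a brand-new key.
category.name       = "Galleries"
category.updated_at = Time.utc(2013, 1, 2)
fresh = serialize.call(category) # miss: the old entry is never consulted
```

The old entry lingers until the cache store evicts it, which is the usual trade-off of this approach.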

Warming the Cache

While caching location data helps recurring searches (the cache hit depends on which locations are returned, not on the point of interest searched), it doesn’t improve responses for searches where the locations haven’t been cached yet. Depending on the dataset, warming the cache may be the best option. While this makes plenty of assumptions about usage data, it worked well for the app I worked on:

big_cities = ['New York, NY', 'Boston, MA', 'San Francisco, CA'] # long list of cities

big_cities.each do |city|
  Location.near(city, 50).each do |location|
    LocationSerializer.new(location).serializable_hash
  end
end

This warms the cache with the latest data and can be thrown in a rake task.

Results

Overall, this resulted in between a 5x and 10x speed improvement without any sort of database query tuning. Warming the cache still takes a long time, but I’ve got some ideas for how to speed up database queries after watching a screencast about improving performance from Joe Ferris.
