ActiveRecord Eager Loading with Query Objects and Decorators

Joe Ferris

We recently came across an interesting problem, which was discussed in a previous post, Postgres Window Functions:

We want to get each post’s three most recent comments.

As discussed, you can’t use simple eager loading:

Post.order(created_at: :desc).limit(5).includes(:comments)

This will load every comment for each post. When there are many comments per post, this quickly becomes unacceptable.

Starting Slow

It’s frequently easiest to start with a slow implementation and make it faster when necessary. In many cases, the slower (and possibly simpler) implementation will work just fine, and it’s best to deploy it as-is. Let’s look at a naive implementation of our list of posts and comments:

class PostsController < ApplicationController
  def index
    @posts = Post.order(created_at: :desc).limit(5)
  end
end
class Post < ActiveRecord::Base
  has_many :comments, dependent: :destroy

  def latest_comments
    comments.order(created_at: :desc).limit(3)
  end
end

This will frequently do fine, but causes N+1 queries, which will look like this in your log:

    Started GET "/" for 127.0.0.1 at 2014-09-18 11:36:18 -0400
    Processing by PostsController#index as HTML
      Post Load (0.4ms)  SELECT "posts".* FROM "posts"
        ORDER BY "posts"."created_at" DESC LIMIT 5
      Comment Load (0.2ms)  SELECT "comments".* FROM "comments"
        WHERE "comments"."post_id" = $1
        ORDER BY "comments"."created_at" DESC
        LIMIT 3  [["post_id", 27]]
      Comment Load (0.3ms)  SELECT "comments".* FROM "comments"
        WHERE "comments"."post_id" = $1
        ORDER BY "comments"."created_at" DESC
        LIMIT 3  [["post_id", 28]]
      Comment Load (0.2ms)  SELECT "comments".* FROM "comments"
        WHERE "comments"."post_id" = $1
        ORDER BY "comments"."created_at" DESC
        LIMIT 3  [["post_id", 29]]
      ...

If you’re using New Relic like we do, you’ll know a slow transaction has an N+1 problem when it shows many queries to the same model or table in the transaction log. Once performance starts to suffer, you’ll want to consolidate those queries.

In the previous post, we described how you could use a Postgres Window Function to find the comments you want in one query:

SELECT * FROM (
  SELECT comments.*, dense_rank() OVER (
    PARTITION BY comments.post_id
    ORDER BY comments.created_at DESC
  ) AS comment_rank
) AS ranked_comments
WHERE comment_rank < 4;

However, how can we plug this query into ActiveRecord such that we can use the data in our views?

It’s actually fairly easy. You need two new objects: a Query Object and a Decorator. Let’s refactor to introduce these objects, and then we’ll plug in our query.

The Query Object

We can perform the Extract Class refactoring and create a Feed model:

class Feed
  def initialize(posts:)
    @posts = posts.order(created_at: :desc).limit(5)
  end

  def posts
    @posts
  end
end
class PostsController < ApplicationController
  def index
    @feed = Feed.new(posts: Post.all)
  end
end

The Decorator

We can use SimpleDelegator to create a quick decorator class for Post:

class PostWithLatestComments < SimpleDelegator
  def latest_comments
    comments.order(created_at: :desc).limit(3)
  end
end

We can apply this decorator to each Post in the Feed:

class Feed
  def posts
    @posts.map { |post| PostWithLatestComments.new(post) }
  end
end

The SQL Query

At this point, we’ve done nothing except to introduce two new classes to our system. However, we’ve provided ourselves an opportunity.

We frequently use Query Objects to wrap up complex SQL without polluting models. In addition to encapsulating SQL, though, they can also hold context, empowering objects to remember the query from whence they came. We’ll use this property of query objects to plug our SQL into our application.

First, we’ll use the above SQL query to find the comments relevant to our posts:

# feed.rb

def comments
  Comment.
    select("*").
    from(Arel.sql("(#{ranked_comments_query}) AS ranked_comments")).
    where("comment_rank <= 3")
end

def ranked_comments_query
  Comment.where(post_id: @posts.map(&:id)).select(<<-SQL).to_sql
    comments.*,
    dense_rank() OVER (
      PARTITION BY comments.post_id
      ORDER BY comments.created_at DESC
    ) AS comment_rank
  SQL
end

Then, we’ll group those comments by post_id into a Hash:

class Feed
  def initialize(posts:)
    @posts = posts.order(created_at: :desc).limit(5)
    @comment_cache = build_comment_cache
  end

  # ...

  private

  def build_comment_cache
    comments.group_by(&:post_id)
  end

  # ...
end

Now, we pass that Hash to our decorator:

# feed.rb

def posts
  @posts.map { |post| PostWithLatestComments.new(post, @comment_cache) }
end
class PostWithLatestComments < SimpleDelegator
  def initialize(post, comments_by_post_id)
    super(post)
    @comments_by_post_id = comments_by_post_id
  end

  def latest_comments
    @comments_by_post_id[id] || []
  end
end

The Result

Our Feed class is now smart enough to perform two SQL queries:

  • One query to posts to find the posts we care about.
  • One query to comments (using Postgres Window Functions) to find the latest three comments for each post.

It then decorates each post, providing the preloaded Hash of comments to the decorator. This allows the decorated posts to find their latest three comments without performing an additional query.

The finished Feed class looks like this:

class Feed
  def initialize(posts:)
    @posts = posts.order(created_at: :desc).limit(5)
    @comment_cache = build_comment_cache
  end

  def posts
    @posts.map { |post| PostWithLatestComments.new(post, @comment_cache) }
  end

  private

  def build_comment_cache
    comments.group_by(&:post_id)
  end

  def comments
    Comment.
      select("*").
      from(Arel.sql("(#{ranked_comments_query}) AS ranked_comments")).
      where("comment_rank <= 3")
  end

  def ranked_comments_query
    Comment.where(post_id: @posts.map(&:id)).select(<<-SQL).to_sql
      comments.*,
      dense_rank() OVER (
        PARTITION BY comments.post_id
        ORDER BY comments.created_at DESC
      ) AS comment_rank
    SQL
  end
end

As you can see, most of the logic is concerned with generating that SQL query, and the machinery for plugging the results into our ActiveRecord models is very lightweight.

At this point, requests in our log look something like this:

Started GET "/" for 127.0.0.1 at 2014-09-18 13:53:39 -0400
Processing by PostsController#index as HTML
  Post Load (0.4ms)  SELECT "posts".* FROM "posts"
    ORDER BY "posts"."created_at" DESC
    LIMIT 5
  Comment Load (0.5ms)  SELECT * FROM (
    SELECT comments.*,
      dense_rank() OVER (
        PARTITION BY comments.post_id
        ORDER BY comments.created_at DESC
      ) AS comment_rank
    FROM "comments"
    WHERE "comments"."post_id" IN (154, 153)
    ) AS ranked_comments WHERE (comment_rank <= 3)

You can use this approach for many situations where it’s difficult to use simple eager loading.

What’s next

If you’re looking for more ActiveRecord magic, be sure to check out our Advanced ActiveRecord Querying course on Upcase, or learn how the query in this post works by reading about Postgres Window Functions.