Blog in Markdown, Deploy with Webhooks

We recently rewrote the code that powers this blog. Previously, the blog ran as a Middleman app. The new system is tailored to our preferred authoring workflow (Markdown + GitHub) and takes advantage of webhooks to automate tasks that are not writing or reviewing a post.

Splitting content from engine

The idea to rebuild the blog stemmed from a conversation about publishing new blog posts. We love our process of writing posts in Markdown, versioning them via Git and reviewing them via GitHub pull requests. However, in our previous setup, we needed to redeploy the blog to Heroku whenever a new post was published. This was tedious and frustrating.

The ideal workflow would be to merge a pull request for a new blog post and have everything else happen automatically. A big obstacle to this goal was the coupling between the content of our blog and the application code that served it.

This led to the decision to break up our blog into two independent repositories. One would contain the blog engine, written in Rails, while the other would be strictly Markdown documents.

[Figure: the new blog workflow]

Setting up a GitHub webhook

GitHub allows you to subscribe to events on a repository via a webhook. You provide a URL, and GitHub sends an HTTP POST request to it every time the designated event occurs.

When a new post gets merged to the master branch of the content repository, we respond by kicking off an import.
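As a rough illustration of that step, the sketch below inspects a push event payload and works out which Markdown files should be imported. It assumes the webhook is subscribed to push events, whose payload carries a `ref` and a list of `commits` with `added` and `modified` paths. The helper name `posts_to_import` and the `.md` filter are hypothetical and not taken from our engine.

```ruby
require "json"

# Hypothetical helper: returns the Markdown paths touched by a push to the
# given branch, or an empty array when the push targets another branch.
def posts_to_import(payload_json, branch: "master")
  payload = JSON.parse(payload_json)
  return [] unless payload["ref"] == "refs/heads/#{branch}"

  payload.fetch("commits", [])
         .flat_map { |commit| commit.fetch("added", []) + commit.fetch("modified", []) }
         .select { |path| path.end_with?(".md") }
         .uniq
end

# Trimmed-down example payload; real push payloads contain many more fields.
payload = {
  "ref" => "refs/heads/master",
  "commits" => [
    { "added" => ["new-post.md"], "modified" => ["README.txt"] },
    { "added" => [], "modified" => ["new-post.md", "older-post.md"] }
  ]
}.to_json

paths = posts_to_import(payload)
# paths == ["new-post.md", "older-post.md"]
```

Filtering on the branch inside the handler keeps pushes to draft branches from triggering an import.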

[Figure: GitHub's webhook event options]

GitHub’s documentation for webhooks is pretty good. Check it out.

For security reasons, we want to restrict access to the webhook URL to only allow payloads from GitHub. GitHub allows you to set a secret key, which it uses to sign each request it sends. If the signature in the request header matches an HMAC of the payload computed with that same secret key, then we know the request is genuine.
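A minimal sketch of that check in plain Ruby follows. It assumes the SHA-256 variant of the signature, which GitHub sends in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The method names and the secret value are illustrative, not the code our engine runs.

```ruby
require "openssl"

# Hypothetical helper: true when `signature_header` (e.g. "sha256=<hex>")
# matches an HMAC-SHA256 of the raw request body keyed with the shared secret.
def valid_github_signature?(raw_body, signature_header, secret)
  return false if signature_header.nil?

  digest = OpenSSL::Digest.new("sha256")
  expected = "sha256=" + OpenSSL::HMAC.hexdigest(digest, secret, raw_body)
  secure_compare(expected, signature_header)
end

# Constant-time comparison so response timing does not reveal how many
# leading characters of the signature matched.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

secret = "example-secret" # illustrative value only
body = '{"ref":"refs/heads/master"}'
header = "sha256=" + OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("sha256"), secret, body)

genuine = valid_github_signature?(body, header, secret)        # true
tampered = valid_github_signature?(body + " ", header, secret) # false
```

The signature has to be computed over the raw request body, before any JSON parsing, because re-serializing the payload can change its bytes.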

Caching with Fastly

We host our blog on Heroku and use Fastly as our CDN. Jessie wrote a fantastic post on how to set up Fastly with a Rails application. We used this approach for the blog engine. When we import a new post, we purge the cache. However, this won’t work for posts that don’t show up immediately on the blog, such as those with future dates. To cover those, we also run a daily task via Heroku Scheduler that purges the posts.

Initially, we were confused by the Article#purge and Article.purge_all methods added to our models by the fastly-rails gem. Article#purge expires all pages that carry the surrogate key for that individual article, while Article.purge_all expires all pages that carry the general article surrogate key.

Some pages have both, for example:

def index
  @articles = Article.recent
  set_surrogate_key_header Article.table_key, @articles.map(&:record_key)
end

This index page can be expired by calling Article.purge_all or by calling purge on any of the article objects rendered on that page.

So when should you use one over the other?

When creating a new object, use purge_all. The new object isn’t on any cached page yet, so purge wouldn’t expire anything.
When updating an object, you can use purge. This will expire any pages that render that object.
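The difference is easier to see with a toy model. The sketch below is not fastly-rails or Fastly itself. `ToyCache` is a hypothetical in-memory stand-in that tags cached pages with surrogate keys, and the key strings ("articles", "articles/1") are assumed to resemble what table_key and record_key produce.

```ruby
require "set"

# Toy stand-in for a CDN that tags each cached page with surrogate keys.
class ToyCache
  def initialize
    @pages = {} # path => Set of surrogate keys
  end

  def store(path, keys)
    @pages[path] = Set.new(keys)
  end

  # Expires every page tagged with `key` and returns the purged paths.
  def purge_by_key(key)
    purged = @pages.select { |_path, keys| keys.include?(key) }.keys
    purged.each { |path| @pages.delete(path) }
    purged
  end

  def cached?(path)
    @pages.key?(path)
  end
end

cache = ToyCache.new
cache.store("/", ["articles", "articles/1", "articles/2"]) # index page
cache.store("/articles/1", ["articles/1"])                 # single article page

# A brand-new article 3 is on no cached page, so purging its own key is a no-op.
cache.purge_by_key("articles/3") # => []
# Purging the general key expires the index, which must now list article 3.
cache.purge_by_key("articles")   # => ["/"]
# Updating article 1 only needs its record key.
cache.purge_by_key("articles/1") # => ["/articles/1"]
```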

Building a sitemap

Search engines like Google and Bing use XML sitemaps to discover the pages they crawl and index. The Giant Robots sitemap allows us to inform search engines about the relative importance of each URL on the site and how often it changes. The most popular gem we found for generating sitemaps, SitemapGenerator, generates static files and suggests setting up a cron job to update them periodically. We found that it wasn’t difficult to serve our own dynamic sitemap using the Builder templates that ship with Rails.
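To show the shape of the document such a template produces, here is a plain-Ruby sketch that builds the XML by hand instead of using Builder, so it runs without Rails. The `Post` struct, its fields, and the example URL are hypothetical. The element names and namespace come from the sitemaps.org protocol.

```ruby
require "cgi"
require "time"

# Hypothetical stand-in for an article record.
Post = Struct.new(:url, :updated_at, :changefreq, :priority)

# Renders a sitemap with one <url> entry per post.
def sitemap_xml(posts)
  lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  ]
  posts.each do |post|
    lines << "  <url>"
    lines << "    <loc>#{CGI.escapeHTML(post.url)}</loc>"
    lines << "    <lastmod>#{post.updated_at.utc.iso8601}</lastmod>"
    lines << "    <changefreq>#{post.changefreq}</changefreq>"
    lines << "    <priority>#{post.priority}</priority>"
    lines << "  </url>"
  end
  lines << "</urlset>"
  lines.join("\n")
end

posts = [
  Post.new("https://example.com/posts?id=1&ref=home", Time.utc(2014, 1, 15), "monthly", 0.5)
]
xml = sitemap_xml(posts)
```

Because the document is rendered per request from current records, there is no static file to regenerate on a schedule.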

Authoring posts locally

While splitting the content from the engine simplified a lot of things, it did make previewing posts more difficult. Previously, an author could spin up a local Middleman server and preview their post exactly as it would show up on the blog. However, the new engine doesn’t read files from the local repo but imports them from GitHub instead. This would force authors to:

Set up a local version of the engine
Connect it to the GitHub repository’s webhook
Push to GitHub in order for GitHub to send the file back down to their local machine so the engine can render it.

This whole workflow is tedious. We considered using a standard Markdown editor such as Marked to preview the posts but then they wouldn’t be rendered using our stylesheet and layouts.

We decided to implement an author mode that would read Markdown files from the local file system rather than from the database and GitHub. In order to do this, we built a set of lightweight objects that mimic our ActiveRecord models: Local::Article, Local::Author, and Local::Tag. These objects are backed by the file system rather than the database.
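As a rough idea of what such an object might look like, here is a minimal file-backed sketch. Only the class name comes from this post. The `all` and `find` finders, the `slug` and `body` attributes, and the one-file-per-post layout are assumptions made for illustration.

```ruby
require "tmpdir"

module Local
  # Hypothetical file-backed stand-in for an ActiveRecord Article model.
  class Article
    attr_reader :slug, :body

    def initialize(path)
      @slug = File.basename(path, ".md")
      @body = File.read(path)
    end

    # Mirrors the class-level finders a controller would call on the real model.
    def self.all(posts_path = ENV.fetch("LOCAL_POSTS_PATH"))
      Dir.glob(File.join(posts_path, "*.md")).sort.map { |path| new(path) }
    end

    def self.find(slug, posts_path = ENV.fetch("LOCAL_POSTS_PATH"))
      all(posts_path).find { |article| article.slug == slug } or
        raise ArgumentError, "no local post named #{slug}"
    end
  end
end

# Usage against a throwaway directory of posts.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "hello-world.md"), "# Hello\n")
  Local::Article.all(dir).map(&:slug)            # => ["hello-world"]
  Local::Article.find("hello-world", dir).body   # => "# Hello\n"
end
```

Keeping the same method names as the database-backed model is what lets the controllers stay unaware of which class they are talking to.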

To ensure the correct objects are called by the controller we added the following initializer:

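# Only swap in the file-backed classes when author mode is explicitly enabled.
# ENV[] (rather than ENV.fetch without a default) avoids a KeyError when
# AUTHOR_MODE is unset, e.g. in production.
if ENV["AUTHOR_MODE"] == "true" && ENV["LOCAL_POSTS_PATH"].present?
  require "local/article"
  require "local/tag"
  require "local/author"

  # Define the constants up front so the database-backed models are never autoloaded.
  Article = Local::Article
  Author = Local::Author
  Tag = Local::Tag
end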

You might expect this to cause some “already initialized constant” errors if these assignments redefine constants from our models, or if the models get loaded after this initializer. However, this is not the case. Rails’ autoloading system will only load a model file if it encounters an undefined constant named for that file. Since our constants are already defined, Rails will never load the models.
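The mechanism can be imitated in a few lines of plain Ruby. This sketch assumes the const_missing-based behaviour of Rails' classic autoloader, which matches the description above. The `Blog` module and the `LOADED` list are hypothetical names used only to record what the stand-in loader had to load.

```ruby
module Blog
  LOADED = [] # names the stand-in "autoloader" had to load

  # Stand-in for an autoloader hooked into const_missing: it only runs
  # when a constant lookup fails.
  def self.const_missing(name)
    LOADED << name
    const_set(name, Class.new) # pretend we required app/models/<name>.rb
  end

  # Defined up front, like the constants assigned in the initializer.
  Article = Struct.new(:title)
end

article_class = Blog::Article # already defined, so const_missing never fires
author_class = Blog::Author   # undefined, so the hook "loads" it

Blog::LOADED # => [:Author]
```

The lookup for Article succeeds immediately, so the loader never gets a chance to load a second definition.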

Conclusion

Since these changes went live, authoring blog posts has become much more streamlined. We get to focus on writing content using our favorite plain-text editor and getting feedback on GitHub. Once satisfied, we merge our post and it automatically shows up on the blog on the date it is scheduled for. Magical!