Deploying cron to Heroku is really…pleasant. Click “Daily” or “Hourly” cron and you’re done, with no tedious setup of scripts, no checking that output is logged, and no referring to CronWTF. Testing it, though, is a pain! No longer should this be the case.
Outside of Heroku, splitting up daily and hourly cron tasks is easy: just have a script/cron_hourly and script/cron_daily in your Rails app, and have fun configuring that on your server. On Heroku, it’s handled in one Rake task. Here’s an example from the Dev Center:
desc "This task is called by the Heroku cron add-on"
task :cron => :environment do
  if Time.now.hour % 4 == 0 # run every four hours
    puts "Updating feed..."
    NewsFeed.update
    puts "done."
  end

  if Time.now.hour == 0 # run at midnight
    User.send_reminders
  end
end
Two things stand out here. First, checking for hourly/daily tasks is done by looking at Time.now. Second, that’s a lot of logic to put in a Rake task without writing a test!
We’ve talked about testing Rake integration before, and we’re going to use a similar pattern here: extract the Rake task into a model. For Radish, our Rakefile now has:
desc "Run cron job"
task :cron => :environment do
  Cron.run
end
Our Cron class now has to handle the daily and hourly tasks. For now, this is all done in the run class-level method. For Radish, the method has to archive every project each hour and activate accounts once a day.
I ended up just using RSpec to test this out. Timecop helps with freezing time in the right place, and Bourne gives us test spies to make sure the right methods get called. Here’s the test I ended up with:
require 'spec_helper'

describe Cron do
  before do
    Account.stubs(:activate)
    Archive.stubs(:store)
  end

  let!(:project1) { Factory(:project) }
  let!(:project2) { Factory(:project) }

  after do
    Timecop.return
  end

  it "runs nightly" do
    Timecop.freeze(Time.now.midnight)
    Cron.run
    Account.should have_received(:activate)
    Archive.should_not have_received(:store)
  end

  it "runs hourly" do
    now = Time.now.midnight + 1.hour
    Timecop.freeze(now)
    Cron.run
    Account.should_not have_received(:activate)
    Archive.should have_received(:store).with(project1, now, now - 1.hour)
    Archive.should have_received(:store).with(project2, now, now - 1.hour)
  end
end
Let’s start from the top of the test here. The before block uses Bourne to stub out the two class level methods on other models in the application, so we can assert they were called later. We then hook up two let! blocks for two projects, which we will use later. let! as opposed to let in RSpec will force those blocks to be evaluated for each test run instead of being lazy evaluated when they are referenced. Finally, since we’re going to freeze time for each test, we have to return to the system time in the after block.
Our two tests verify our daily and hourly scenarios. The first freezes time at midnight, which may not be exactly when Heroku runs our cron job, but all we care about is that the daily task runs only once. The hourly test freezes time at 1:00AM and checks that only the hourly task gets run, and not the daily.
I could have gone a little more gung-ho on this test, perhaps running through an entire 24 hours and making sure the daily task was only called once, but this was good enough.
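For the curious, here’s a rough sketch of what that fuller 24-hour check could look like. It’s plain Ruby with no Rails, Timecop, or Bourne, and `tasks_due` is a hypothetical stand-in that only mirrors the branching in Cron.run (it is not code from Radish):

```ruby
# Hypothetical stand-in for the branching in Cron.run:
# returns the names of the tasks that would run at `time`.
def tasks_due(time)
  tasks = [:archive]                    # hourly work always runs
  tasks << :activate if time.hour == 0  # daily work only in the midnight hour
  tasks
end

# Walk through one full day, an hour at a time, and count each task.
midnight = Time.utc(2011, 7, 5, 0, 0, 0)
counts = Hash.new(0)
24.times do |hour|
  tasks_due(midnight + hour * 3600).each { |task| counts[task] += 1 }
end

puts "archive ran #{counts[:archive]} times, activate ran #{counts[:activate]} time(s)"
```

In the real spec the same loop would freeze time with Timecop for each hour and call Cron.run instead.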
Here’s what I ended up with in my Cron model:
class Cron
  def self.run
    now = Time.now

    Project.find_each do |project|
      Archive.store(project, now, now - 1.hour)
    end

    if now.hour == 0
      Account.activate
    end
  end
end
The implementation ended up being pretty simple: grab the time, always archive (since the task is run hourly), and if it’s run in the 12:00AM hour, activate accounts.
Pushing this code down into a model makes more sense now…too much code that deals with models and not test data or factories in a Rake task always smells a bit funky to me. It’s also much easier to refactor the code now that we’re in a real model and we have a testing feedback loop in place. For instance, the Project.find_each loop could easily be extracted into the Project class.
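As a rough idea of that extraction, here’s a sketch that runs without Rails. The `archive_all` name is made up for illustration, and the Archive and Project classes below are simple stand-ins for the real models (so `find_each` just walks an array, and `3600` stands in for ActiveSupport’s `1.hour`):

```ruby
# Stand-ins so the sketch runs without Rails; in the app these are real models.
class Archive
  def self.calls
    @calls ||= []
  end

  # Records the arguments instead of storing anything.
  def self.store(project, to, from)
    calls << [project, to, from]
  end
end

class Project
  ALL = [:project1, :project2].freeze

  def self.find_each(&block)
    ALL.each(&block)
  end

  # Hypothetical method name: the loop from Cron.run, moved onto Project.
  def self.archive_all(now)
    find_each { |project| Archive.store(project, now, now - 3600) } # 3600s == 1.hour
  end
end

now = Time.utc(2011, 7, 5, 1, 0, 0)
Project.archive_all(now)
puts "archived #{Archive.calls.length} projects"
```

With something like this in place, Cron.run would only need to call `Project.archive_all(now)` before checking the hour.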
During this process I learned of a UNIX trick that can help with testing this locally: the TZ environment variable. The appropriately named UNIX Power Tools puts it best:
The TZ environment variable is a little obscure, but it can be very useful. It tells UNIX what time zone you’re in.
Most of the time scripts will get this from your environment, but you can override it. Here’s a simple way to test this:
% TZ=UTC+5 ruby -e "puts Time.now"
2011-07-05 11:30:48 -0500
% TZ=UTC-8 ruby -e "puts Time.now"
2011-07-06 00:30:42 +0800
% TZ=UTC ruby -e "puts Time.now"
2011-07-05 16:30:53 +0000
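So basically, if you want to force it to be midnight when running a test or a small script, you can set this environment variable to shift the local time your process sees. Note that the sign is the opposite of what you might expect: TZ=UTC+5 means five hours behind UTC, which is why the first example above prints -0500.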
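The same trick can be used from inside Ruby. This is only a sketch, assuming a UNIX-like system where Ruby picks up changes to `ENV['TZ']`; the `with_tz` and `tz_for_local_hour` helpers are names invented here, not part of any library:

```ruby
# Run the block with TZ temporarily set, restoring the old value afterwards.
def with_tz(zone)
  old = ENV['TZ']
  ENV['TZ'] = zone
  yield
ensure
  ENV['TZ'] = old
end

# Build a POSIX-style TZ value that makes `utc_time` land in `target_hour`.
# POSIX offsets are inverted: "UTC+5" means five hours *behind* UTC.
def tz_for_local_hour(utc_time, target_hour)
  offset = (utc_time.hour - target_hour) % 24
  "UTC+#{offset}"
end

utc = Time.utc(2011, 7, 5, 16, 30, 53)
zone = tz_for_local_hour(utc, 0)
local_hour = with_tz(zone) { utc.getlocal.hour }

puts "#{zone} puts #{utc} in local hour #{local_hour}"
```

Swap `utc` for `Time.now.utc` to work out which TZ value makes it “midnight” right now.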
Testing cron is now actually feasible, and now you can be assured your task will work without waiting an entire hour or day to find out. Which of course, means you can ship it faster!
Redis is a key/value store, but it’s jam-packed with a ton of other little utilities that make it a joy to explore and implement. Two of these are the PUBLISH and SUBSCRIBE commands, which enable you to do quick messaging and communication between processes. Granted, there are plenty of other messaging systems out there (AMQP and ØMQ come to mind), but Redis is worth a look too.
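The way it works is simple: SUBSCRIBE listens for messages on one or more channels, and PUBLISH sends a message to a channel, where every current subscriber receives it.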
Those two commands are all you need to build a messaging system with Redis. But, what should you build?
I tend to build quick little games to learn new ideas, frameworks, languages, etc. For Redis pub/sub I chose to emulate IRC, since “channels” are essentially the same concept for an IRC server. A user connects, talks into a channel, and if others are there, they get the message. This is the basic concept; we’re not going to re-implement the IRC protocol here.
I scraped together two tiny little Ruby scripts for this. The repo is on GitHub, if you want to play with it. Make sure you’re running Redis locally first! Install Redis via redis.io if not.
First up, pub.rb publishes messages to a channel. We’re going to bring in the Redis gem to use for making a client, and JSON in order to have an easy transport format. We could have used Ruby’s Marshal class instead to serialize, but this works fine and is human readable.
# usage:
#   ruby pub.rb channel username
require 'rubygems'
require 'redis'
require 'json'

$redis = Redis.new
data = {"user" => ARGV[1]}

# Read lines until EOF (Ctrl-D) so a nil from gets doesn't blow up on strip.
while (msg = STDIN.gets)
  $redis.publish ARGV[0], data.merge('msg' => msg.strip).to_json
end
This script will run interactively, once provided a channel and username from the command line arguments. It then fires off the PUBLISH command every time the user types a message and hits Enter. It publishes the message to a channel (in ARGV[0], the first command line argument).
So if we were to run:
% ruby pub.rb rubyonrails qrush
Hello world
Our Redis client would then send a PUBLISH command down the “rubyonrails” channel with the given message. The message itself is JSON and looks like:
{
"msg": "Hello world",
"user": "qrush"
}
We can actually verify this with the MONITOR command, which will spit out all commands the Redis server has processed. If we had MONITOR running before sending the above hello world snippet, it shows:
% redis-cli
redis> MONITOR
OK
1306462616.036890 "MONITOR"
1306462620.482914 "publish" "rubyonrails" "{\"user\":\"qrush\",\"msg\":\"Hello world\"}"
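If you don’t have Redis handy, the payload itself is easy to poke at with just the JSON library. Here’s a small sketch of the round trip: it builds the same hash pub.rb builds and formats it the way a subscriber might. The `format_line` helper is a name made up for this example:

```ruby
require 'json'

# What pub.rb builds before calling PUBLISH (gets includes the trailing newline).
data    = { 'user' => 'qrush' }
payload = data.merge('msg' => "Hello world\n".strip).to_json

# What a subscriber can do with each payload it receives.
def format_line(channel, payload)
  message = JSON.parse(payload)
  "##{channel} - [#{message['user']}]: #{message['msg']}"
end

line = format_line('rubyonrails', payload)
puts payload
puts line
```

JSON keeps the payload readable in MONITOR output, which is the main reason to prefer it over Marshal here.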
Currently this simple script doesn’t support publishing under more than one channel. Opening up more than one pub.rb process will let you do that.
Woot! Now that we have messages being sent we have to listen to them. Enter sub.rb:
require 'rubygems'
require 'redis'
require 'json'

$redis = Redis.new(:timeout => 0)

$redis.subscribe('rubyonrails', 'ruby-lang') do |on|
  on.message do |channel, msg|
    data = JSON.parse(msg)
    puts "##{channel} - [#{data['user']}]: #{data['msg']}"
  end
end
Once again we’ll need Redis and JSON to connect and parse messages. The initialization process for Redis is different this time: it’s using a new :timeout option. This will force the Redis client to never time out when waiting for a response, so we’ll wait forever for messages to come in. Perfect!
This script subscribes to two different channels: rubyonrails and ruby-lang. Basically, once the interpreter reaches the subscribe block, it will never exit and continue to wait for messages.
When a message comes in, the message block is fired, yielding two arguments: the channel the message was on, and the actual data sent down the pipe. Parsing that JSON chunk then allows us to spit out who said it, where they said it, and what was actually said. Here’s what it looks like if some other clients are publishing messages after we run our sub.rb file. (The published messages arrive in the order they are sent, but that’s hard to display in text):
% ruby pub.rb rubyonrails qrush
Whoa!
`rake routes` right?
% ruby pub.rb rubyonrails turbage
How do I list routes?
Oh, duh. thanks bro.
% ruby pub.rb ruby-lang qrush
I think it's Array#include? you really want.
% ruby sub.rb
#rubyonrails - [qrush]: Whoa!
#rubyonrails - [turbage]: How do I list routes?
#ruby-lang - [qrush]: I think it's Array#include? you really want.
#rubyonrails - [qrush]: `rake routes` right?
#rubyonrails - [turbage]: Oh, duh. thanks bro.
Whoa! How does it work!? Under the hood in the redis-rb client, the subscribe block is actually stuck in a Ruby loop (called from Redis::SubscribedClient#subscription). The client is going to continually attempt to read from the socket for messages, until there’s an error of some kind (but not a Timeout!). The redis-server then keeps a list of channels and patterns for each connected client, and publishes messages to them when a PUBLISH command is sent.
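To make that bookkeeping a little more concrete, here’s a toy in-memory model of it. This is not how redis-server or redis-rb is implemented, just a sketch of the channel-to-subscribers idea; the ToyBroker class is invented for this example and leaves out pattern subscriptions, sockets, and everything else interesting:

```ruby
# Toy in-memory broker: NOT Redis, just the channel -> subscribers bookkeeping.
class ToyBroker
  def initialize
    @subscribers = Hash.new { |hash, channel| hash[channel] = [] }
  end

  # Register the same callback for each named channel.
  def subscribe(*channels, &callback)
    channels.each { |channel| @subscribers[channel] << callback }
  end

  # Deliver to everyone on the channel; returns how many callbacks were called.
  def publish(channel, message)
    receivers = @subscribers.fetch(channel, [])
    receivers.each { |callback| callback.call(channel, message) }
    receivers.size
  end
end

broker = ToyBroker.new
received = []
broker.subscribe('rubyonrails', 'ruby-lang') do |channel, msg|
  received << "##{channel}: #{msg}"
end

delivered = broker.publish('rubyonrails', 'Whoa!')      # one listener
missed    = broker.publish('nodejs', 'anyone here?')    # nobody subscribed

puts received
puts "delivered=#{delivered} missed=#{missed}"
```

Like the real PUBLISH command, `publish` here reports how many subscribers got the message, and a message sent to a channel with no subscribers simply goes nowhere.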
Although this is a simple example of how Redis pub/sub works, it’s pretty cool to see what others have done with it. Some examples include Convore, which uses it as a pretty central part of its infrastructure, and Realie, a real-time code editor like Etherpad. Another good link to check out is Salvatore Sanfilippo’s recent interview on The Changelog (around 24:00-27:00), where he discusses how developers are switching from other MQs to Redis due to its simplicity and performance.
If you’re using Redis’ Pub/Sub within your infrastructure, we’d love to hear your feedback on how Radish can provide more visibility to what your messaging system is up to.
It’s no secret that we’re big fans of Redis here at thoughtbot. Redis has great data structures, plenty of fantastic libraries for interacting with it, and is plain fast. Several questions kept arising though every time we deployed Redis to Hoptoad and several client projects: What’s actually going on inside of Redis? What data is being stored, how much, and how fast?
Our answer to these questions is a new Redis analysis and monitoring service, Radish.
Radish works side-by-side with your existing Redis instances in your hosting environment. All you need to do is start a daemon which handles connecting to Redis and sending data out to Radish over HTTPS.
So, what can Radish do?
Picking out spikes in Redis operations is helpful to correlate to web traffic, or lack thereof! We also split out the types of commands for reads vs. writes (GET vs SET, for instance). “Other” commands are usually admin/maintenance actions such as INFO.
We also track several stats about your Redis instance, such as average commands per second, memory usage, changes since last save, and keyspace size. Keeping a growing keyspace and memory footprint in check is a constant effort when maintaining a Redis instance, and Radish can help you track both and keep them at a manageable level. Tracking changes since last save has also helped us figure out that we don’t SAVE our Redis data often enough, which could have dire effects if the Redis instance were to go down.
Breakdowns of high-frequency commands, keys, and namespaces help determine what your instance is processing. Learn what keys are being written to and when. Highlight the areas of your code that might be wasting memory on your Redis instance. Because Redis is single-threaded, Radish can also show what might be blocking it from serving requests.
Radish is already helping us understand what our Redis instances are up to without a lot of custom scripts and temporary hacks. Sign up, get it running with your Redis instances, and let us know how it goes!
Huge thanks to my fellow robots for helping me get this idea off the ground. Our beta testers also deserve kudos for hammering us with live traffic and exposing lots of fun bugs.