giant robots smashing into other giant robots

We are thoughtbot. We make web & mobile apps.

I learned to alias shell commands with Hooked on Phonics

Do your co-workers snicker at your sentence-length shell aliases? I’m here to tell you it’s okay because my aliases border on the Dostoyevskian, too.

Some commands you do a thousand times a day. They deserve super-short aliases:

alias be="bundle exec"
alias s="bundle exec rspec"
alias cuc="bundle exec cucumber"

Like h() or t() in Rails, the more often you invoke a command, the more acceptable it becomes to have a short, cryptic alias.

However, there’s a class of commands I find myself running anywhere from 1 to 20 times a day, namely those for interacting with our staging and production environments on Heroku:

# Heroku staging
alias staging='heroku run console --remote staging'
alias staging-process='watch heroku ps --remote staging'
alias staging-releases='heroku releases --remote staging'
alias staging-tail='heroku logs --tail --remote staging'

# Heroku production
alias production='heroku run console --remote production'
alias production-process='watch bundle exec heroku ps --remote production'
alias production-releases='heroku releases --remote production'
alias production-tail='heroku logs --tail --remote production'

# Heroku databases
alias db-pull-staging='heroku db:pull --remote staging --confirm `basename $PWD`-staging'
alias db-pull-production='heroku db:pull --remote production --confirm `basename $PWD`-production'
alias db-copy-production-to-staging='heroku pgbackups:restore DATABASE `heroku pgbackups:url --remote production` --remote staging --confirm `basename $PWD`-staging'
alias db-backup-production='heroku pgbackups:capture --remote production'
alias db-backups='heroku pgbackups --remote production'

Here’s where Neckbeard next to you silently pities your Fisher-Price programming style. What self-respecting programmer would type all that?!

Well, through the magic of autocompletion, you never type more than a few characters for each command. They might look goofy, but they’re memorable.

If you’d like these aliases and other goodies, they’re packaged up in our dotfiles.

Then, the next time you feel your pair programming partner smirking over your shoulder, just tell ‘em: “I learned to alias shell commands with Hooked on Phonics!”

The most underrated Heroku feature?

It’s right there in the docs but I didn’t notice it until recently:

heroku pgbackups:restore DATABASE `heroku pgbackups:url --remote production` --remote staging

Boom! It transfers the production Postgres database to staging.

It’s much faster than db:pull, then db:push, which is what I used to do (like a sucker).

Setup:

git remote add staging git@heroku.com:my-staging-app.git
git remote add production git@heroku.com:my-production-app.git
heroku addons:add pgbackups --remote staging
heroku addons:add pgbackups --remote production

Create a database backup at any time:

heroku pgbackups:capture --remote production

View backups:

heroku pgbackups --remote production

Destroy a backup:

heroku pgbackups:destroy b003 --remote production

7 minute ab: impatient man’s load tests for a Heroku app

We’ve been working with a client who recently launched a new service. The launch entailed their marketing team sending batches of emails to a 1 million+ person mailing list over 2 days. In the email, there’s a link to the homepage.

The client wanted some confidence that the home page of the Rails app, which is hosted on Heroku, would be able to handle the load generated from that traffic.

They didn’t need a heavy-duty load test, just a little assurance. In turn, I wanted something that was quick to set up and execute.

To repeat, I don’t consider this a rigorous load test. For that, look at something like Blitz.io or Tsung. This is a quick-and-dirty alternative.

Apache Bench

It doesn’t get quicker than Apache Bench:

man ab

The ab command I ended up with:

ab -n 50000 -c 50 -A user:password https://staging.ourapp.com/

50000 requests with 50 concurrent users. Basic auth is used on staging to keep the outside world from seeing the app before it’s unveiled. The trailing / is necessary.

I maxed out at 50 concurrent users because I read in Deploying Rails Applications by Ezra Zygmuntowicz that’s about the most that apache bench can reasonably simulate.

If I had been testing a particular workflow, I might have used the -C flag with a session cookie value grabbed from a browser. That way, every request would use the same session. For this scenario, however, I wanted to generate a new session on each request because I was testing many new users hitting the home page.

Heroku

To get more visibility into what was happening, I added a logging add-on:

heroku addons:upgrade logging:expanded --remote staging

While the tasks ran, I had a shell open tailing the log:

heroku logs -t --remote staging

It was mildly entertaining to watch the foreman-style logs fly by:

2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.9 queue=0 wait=0ms service=49ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 app[web.6]:
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.6 queue=0 wait=0ms service=156ms status=200 bytes=11323
2011-07-12T16:43:37+00:00 app[web.6]:
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.2 queue=0 wait=0ms service=51ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 app[web.6]: Started GET "/" for 75.150.96.93 at 2011-07-12 09:43:37 -0700
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.15 queue=0 wait=0ms service=29ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.16 queue=0 wait=0ms service=58ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 heroku[router]: GET dev.testkitchenschool.com/ dyno=web.3 queue=0 wait=0ms service=159ms status=200 bytes=11323
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.7 queue=0 wait=0ms service=162ms status=200 bytes=11323
2011-07-12T16:43:37+00:00 app[web.7]: Started GET "/" for 75.150.96.93 at 2011-07-12 09:43:37 -0700
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.10 queue=0 wait=0ms service=73ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 app[web.10]: Started GET "/" for 75.150.96.93 at 2011-07-12 09:43:37 -0700
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.12 queue=0 wait=0ms service=179ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 app[web.3]: Started GET "/" for 75.150.96.93 at 2011-07-12 09:43:37 -0700

New Relic

We use New Relic in production so I figured we should use it for these tests:

heroku addons:add newrelic:standard --remote staging

I started small: 5000 requests, 5 concurrent users, 2 dynos. Then, I added concurrent users until I could see the “request queuing” portion of the New Relic add-on:

[Chart: App server time]

The left-hand mountains represent when I got up to 4 dynos and was hitting the app with unlikely amounts of traffic. The green portion is the “request queuing” time.

The right-hand hills represent when I cranked the dynos up to 12 and was hitting the app with best-case scenario traffic (100% click-through rate on the emails) from three laptops. No request queuing time and pretty nice numbers:

  • 5,250 requests per minute
  • 50ms average response time

Those numbers and the chart above come from what New Relic calls the “app server” stats. The “end user” stats look a little different:

[Chart: End user time]

You can see that even though we’re using Jammit for asset packaging, there’s still an opportunity to improve DOM processing and page rendering.

Ideally, we’d be under 2 seconds end user time.

Conclusion

Even though the end user time left room for improvement, this was enough information, in combination with their historical email click-through rates, to give the team confidence. In total, this took less than half an hour, and most of that time was spent working on other things while the tests ran.
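For a rough sense of how much headroom those app server numbers leave, here is a back-of-the-envelope sketch using Little's law (average requests in flight = arrival rate × average time per request). The figures are the ones reported above. The assumption that each dyno handles one request at a time is mine, as are the method names, so treat the utilization figure as an estimate rather than a measurement.

```ruby
# Figures reported above; everything derived from them is an estimate.
REQUESTS_PER_MINUTE  = 5_250
AVG_RESPONSE_SECONDS = 0.050
DYNOS                = 12

# Arrival rate in requests per second.
def requests_per_second(requests_per_minute)
  requests_per_minute / 60.0
end

# Little's law: average requests in flight = arrival rate * average time per request.
def requests_in_flight(requests_per_minute, avg_seconds)
  requests_per_second(requests_per_minute) * avg_seconds
end

# Assumption: each dyno serves one request at a time, so utilization is
# the number of requests in flight divided by the number of dynos.
def dyno_utilization(requests_per_minute, avg_seconds, dynos)
  requests_in_flight(requests_per_minute, avg_seconds) / dynos
end

puts format("%.1f requests/second", requests_per_second(REQUESTS_PER_MINUTE))
puts format("%.2f requests in flight on average",
            requests_in_flight(REQUESTS_PER_MINUTE, AVG_RESPONSE_SECONDS))
puts format("%.0f%% estimated dyno utilization",
            dyno_utilization(REQUESTS_PER_MINUTE, AVG_RESPONSE_SECONDS, DYNOS) * 100)
```

Under that assumption the dynos were busy a bit more than a third of the time, which fits with New Relic showing no request queuing at 12 dynos.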

Post Script: Right action, right time

I didn’t add caching (page, action, fragment, or otherwise) at all. Split testing code already kept the homepage from being trivial to cache so if it wasn’t necessary, I wanted to avoid it. The data said it wasn’t necessary.

Testing Cron on Heroku

Deploying cron to Heroku is really…pleasant. Click “Daily” or “Hourly” cron and you’re done, without any tedious setup of scripts, ensuring output is logged, or referring to CronWTF. Testing it, though, is a pain! No longer should this be the case.

Cron is serious business.

Outside of Heroku, splitting up daily and hourly cron tasks is easy: just have a script/cron_hourly and script/cron_daily in your Rails app, and have fun configuring that on your server. On Heroku, it’s handled in one Rake task. Here’s an example from the Dev Center:

desc "This task is called by the Heroku cron add-on"
task :cron => :environment do
  if Time.now.hour % 4 == 0 # run every four hours
    puts "Updating feed..."
    NewsFeed.update
    puts "done."
  end

  if Time.now.hour == 0 # run at midnight
    User.send_reminders
  end
end

Two things stand out here. First, checking for hourly/daily tasks is done by looking at Time.now. Second, that’s a lot of logic to put in a Rake task without writing a test!
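To see what those two Time.now checks amount to over a day, here is a small plain-Ruby sketch. It assumes the task is invoked once per hour, and the method names are made up for illustration; only the two conditions come from the Dev Center example.

```ruby
# Hypothetical helper names; the conditions mirror the Rake task above.
def feed_update_hour?(hour)
  hour % 4 == 0 # "run every four hours"
end

def reminder_hour?(hour)
  hour == 0 # "run at midnight"
end

HOURS_IN_DAY = (0..23)

# Hours of the day in which each branch would fire, given one run per hour.
def feed_update_hours
  HOURS_IN_DAY.select { |hour| feed_update_hour?(hour) }
end

def reminder_hours
  HOURS_IN_DAY.select { |hour| reminder_hour?(hour) }
end

puts "Feed updates at hours: #{feed_update_hours.inspect}"
puts "Reminders at hours:    #{reminder_hours.inspect}"
```

Pulling the conditions out into methods that take the hour as an argument is also what makes them easy to check without touching the clock.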

We’ve talked about testing Rake integration before, and we’re going to use a similar pattern here: extract the Rake task into a model. For Radish, our Rakefile now has:

desc "Run cron job"
task :cron => :environment do
  Cron.run
end

Our Cron class now has to handle the daily and hourly tasks. For now, this is all done in the run class-level method. For Radish, the method has to:

  • Archive data hourly for our new historical graphs
  • Activate accounts daily, which will prevent a reminder email from being sent out.

The test

I ended up just using RSpec to test this out. Timecop helps with freezing time in the right place, and Bourne gives us test spies to make sure the right methods get called. Here’s the test I ended up with:

require 'spec_helper'

describe Cron do
  before do
    Account.stubs(:activate)
    Archive.stubs(:store)
  end

  let!(:project1) { Factory(:project) }
  let!(:project2) { Factory(:project) }

  after do
    Timecop.return
  end

  it "runs nightly" do
    Timecop.freeze(Time.now.midnight)
    Cron.run

    Account.should have_received(:activate)
    Archive.should_not have_received(:store)
  end

  it "runs hourly" do
    now = Time.now.midnight + 1.hour
    Timecop.freeze(now)
    Cron.run

    Account.should_not have_received(:activate)
    Archive.should have_received(:store).with(project1, now, now - 1.hour)
    Archive.should have_received(:store).with(project2, now, now - 1.hour)
  end
end

Let’s start from the top of the test here. The before block uses Bourne to stub out the two class-level methods on other models in the application, so we can assert they were called later. We then hook up two let! blocks for two projects, which we will use later. Unlike let, let! in RSpec forces those blocks to be evaluated before each example instead of being lazily evaluated when they are first referenced. Finally, since we’re going to freeze time for each test, we have to return to the system time in the after block.

Our two tests verify our daily and hourly scenarios. The first freezes time at midnight, which may not be exactly when Heroku runs our cron job, but all we care about is that the daily task runs only once. The hourly test freezes time at 1:00AM and checks that only the hourly task gets run, and not the daily.

I could have gone a little more gung-ho on this test, perhaps running through an entire 24 hours and making sure the daily task was only called once, but this was good enough.
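For the curious, the gung-ho version might look something like the sketch below. It is plain Ruby, with no RSpec, Timecop, or Rails, and SketchCron is a hypothetical stand-in, not the real Cron model: it takes the time as an argument and counts calls instead of doing any work.

```ruby
# Hypothetical stand-in for the cron logic: the hourly work happens on
# every run, the daily work only during the midnight hour.
class SketchCron
  attr_reader :archived, :activated

  def initialize
    @archived  = 0
    @activated = 0
  end

  # The clock is passed in, so no time freezing is needed in this sketch.
  def run(now)
    @archived  += 1                   # hourly task
    @activated += 1 if now.hour == 0  # daily task
  end
end

# Run the job once per hour for a full day, starting at midnight UTC.
def simulate_day(cron, start = Time.utc(2011, 7, 5, 0, 0, 0))
  24.times { |hours_elapsed| cron.run(start + hours_elapsed * 3600) }
  cron
end

cron = simulate_day(SketchCron.new)
puts "archived #{cron.archived} times, activated #{cron.activated} time(s)"
```

In the real spec the same idea would be a loop of Timecop.freeze calls around Cron.run, followed by an assertion on how many times Account.activate was received.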

The implementation

Here’s what I ended up with in my Cron model:

class Cron
  def self.run
    now = Time.now

    Project.find_each do |project|
      Archive.store(project, now, now - 1.hour)
    end

    if now.hour == 0
      Account.activate
    end
  end
end

The implementation ended up being pretty simple: grab the time, always archive (since the task is run hourly), and if it’s run in the 12:00AM hour, activate accounts.

Pushing this code down into a model makes more sense now. Too much code in a Rake task that deals with real models, rather than test data or factories, always smells a bit funky to me. It’s also much easier to refactor the code now that we’re in a real model and we have a testing feedback loop in place. For instance, the Project.find_each loop could easily be extracted into the Project class.
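Here is one way that extraction could go, sketched in plain Ruby so it runs without Rails. Everything other than the shape of Cron.run is an assumption: Archive and Project are tiny stand-ins for the real models, archive_all is a made-up name, the now parameter is added so the sketch can be exercised without freezing time, and 3600 seconds stands in for ActiveSupport’s 1.hour.

```ruby
# Stand-in for the real Archive model: records calls instead of storing data.
class Archive
  def self.calls
    @calls ||= []
  end

  def self.store(project, now, since)
    calls << [project, now, since]
  end
end

# Stand-in for the real Project model, with a fake find_each.
class Project
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def self.all
    @all ||= [new("one"), new("two")]
  end

  def self.find_each(&block)
    all.each(&block)
  end

  # The extracted loop (hypothetical name): archive the last hour for every project.
  def self.archive_all(now)
    find_each { |project| Archive.store(project, now, now - 3600) }
  end
end

class Cron
  def self.run(now = Time.now)
    Project.archive_all(now)
    # The daily Account.activate branch is left out of this sketch.
  end
end

Cron.run(Time.utc(2011, 7, 5, 1, 0, 0))
puts "stored #{Archive.calls.size} archives"
```

With the loop living on Project, Cron.run reads as a short list of what happens each hour and each day, and the archiving behaviour can be tested on its own.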

Hacking your time zone

During this process I learned of a UNIX trick that can help with testing this locally: the TZ environment variable. The appropriately named UNIX Power Tools puts it best:

The TZ environment variable is a little obscure, but it can be very useful. It tells UNIX what time zone you’re in.

Most of the time scripts will get this from your environment, but you can override it. Here’s a simple way to test this:

% TZ=UTC+5 ruby -e "puts Time.now"
2011-07-05 11:30:48 -0500

% TZ=UTC-8 ruby -e "puts Time.now"
2011-07-06 00:30:42 +0800

% TZ=UTC ruby -e "puts Time.now" 
2011-07-05 16:30:53 +0000

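So basically, if you want to force it to be midnight when running a test or a small script, you can use this environment variable to shift the local time the process sees. Note the POSIX sign convention in the examples above: TZ=UTC+5 means five hours behind UTC, and TZ=UTC-8 means eight hours ahead.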
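The same override also works from inside a Ruby process by assigning to ENV["TZ"]. The sketch below assumes a POSIX system, and utc_offset_hours_in is a helper name I made up; it sets TZ, reads the resulting UTC offset, and restores the original value afterwards.

```ruby
# Temporarily override TZ for the current process, report the UTC offset
# (in whole hours) that Time.now sees, then restore the previous value.
def utc_offset_hours_in(zone)
  original = ENV["TZ"]
  ENV["TZ"] = zone
  Time.now.utc_offset / 3600
ensure
  # Assigning nil removes the variable if it was not set to begin with.
  ENV["TZ"] = original
end

%w[UTC+5 UTC-8 UTC].each do |zone|
  puts "TZ=#{zone} gives a UTC offset of #{utc_offset_hours_in(zone)} hours"
end
```

For anything beyond a quick local check, freezing time with Timecop as in the spec above is the more predictable route, since it does not depend on the platform’s time zone handling.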

Less of a pain

Testing cron is now actually feasible, and you can be assured your task will work without waiting an entire hour or day to find out. Which, of course, means you can ship it faster!