giant robots smashing into other giant robots

Written by thoughtbot

dancroak

Online presentation with thoughtbot and Heroku

Adarsh, Alex, and I are giving a free online presentation with Abe from Heroku about how their platform fits into our process. Over 1,400 people have registered so far! The details:

  • Tomorrow, April 17
  • Once at 7:00am PDT (2:00pm GMT / 7:30pm IST)
  • A second time at 10:00am PDT (5:00pm GMT / 10:30pm IST)

Heroku event

RSVP here.

We will be live coding(!), making a single change to a live web application hosted on Heroku within the context of a common process for us:

We will also show a few other goodies:

We will share an open source web application and a read-only Trello board during the presentation. There will be a question-and-answer session at the end.

Hope to see you online! RSVP here.

UPDATE: Heroku has posted a recording to YouTube.

dancroak

How to Splunk with Heroku

Splunk is a company that offers logging services. They went public last year, have a market cap of over $3 billion, and are headquartered in San Francisco’s SoMa neighborhood.

I’ve tried Loggly and Papertrail. In my opinion, Splunk is the best of the bunch due to its:

  • Real-time or very-near-real-time data discovery.
  • Wildcard search.
  • Timespan dragging.

Loggly and Papertrail offer Heroku add-ons but Splunk doesn’t. So, setup is a bit more complex with Splunk. Here’s how to do it.

Go to Splunk Storm. Create an account.

Once signed in, create a project:

Create project

You can start with a free plan:

Choose plan

Click “Network data”:

Splunk dashboard

Click “Authorize your IP address”:

Network data

Click “Automatically”:

IP address authorization

You now have 15 minutes to send Splunk data. Copy the URL in the text box:

Automatic authorization

Then, add a Heroku syslog drain:

heroku drains:add logs4.splunkstorm.com:YOURSPLUNKPORT

Perform a few activities on your app to send data to the drain. Then, click “Explore data”:

Dashboard

Perform a search, maybe using wildcards:

Search

I haven’t been diligent about saving common searches. If you have interesting saved Splunk searches you can share, please comment.

Filter by dragging a timespan:

Timespan dragging

Watch how quickly the data loads.

On Rails apps, the default production log level includes enough data to be useful in Splunk. Change it to DEBUG only when debugging:

heroku config:add LOG_LEVEL=DEBUG

At the DEBUG level, Rails will print SQL queries to the logs, which can be useful but may also contain sensitive data as config.filter_parameters does not apply to SQL queries.
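One quick way to check whether SQL is currently reaching the logs is to count the lines that look like SQL statements. This is only a rough sketch: sql_line_count is a hypothetical helper name, and the pattern is a simple heuristic that will miss some statements and may match unrelated text.

```shell
#!/bin/sh
# Sketch only: sql_line_count is a hypothetical helper, not part of the Heroku CLI.
# It reads log lines on stdin and prints how many of them look like SQL statements.
sql_line_count() {
  # Heuristic: match the most common SQL verbs followed by a space.
  grep -c -E '(SELECT|INSERT INTO|UPDATE|DELETE FROM) '
}

# Possible usage, assuming a "staging" git remote exists:
#   heroku logs -n 500 --remote staging | sql_line_count
```

A count of 0 suggests the log level is back to normal; anything higher is a hint to look at what those queries contain.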


dancroak

How to back up a Heroku production database to staging

It’s right there in the docs but I didn’t notice it until recently:

heroku pgbackups:restore DATABASE `heroku pgbackups:url --remote production` --remote staging

Boom! It transfers the production Postgres database to staging.

It’s much faster than db:pull, then db:push, which is what I used to do (like a sucker).

Setup:

git remote add staging git@heroku.com:my-staging-app.git
git remote add production git@heroku.com:my-production-app.git
heroku addons:add pgbackups --remote staging
heroku addons:add pgbackups --remote production

Create a database backup at any time:

heroku pgbackups:capture --remote production

View backups:

heroku pgbackups --remote production

Destroy a backup:

heroku pgbackups:destroy b003 --remote production
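The capture and restore commands above can be combined into one step so that staging always gets a fresh copy. The following is a minimal sketch, assuming the staging and production remotes from the setup section; HEROKU_CMD and refresh_staging are hypothetical names, not part of the Heroku CLI.

```shell
#!/bin/sh
# Sketch only. HEROKU_CMD and refresh_staging are hypothetical names.
# Set HEROKU_CMD=echo to print the commands instead of running them.
HEROKU_CMD="${HEROKU_CMD:-heroku}"

refresh_staging() {
  # Take a fresh production backup so staging does not receive stale data.
  $HEROKU_CMD pgbackups:capture --remote production || return 1

  # Ask Heroku for the URL of the latest production backup.
  backup_url=$($HEROKU_CMD pgbackups:url --remote production) || return 1

  # Overwrite the staging database with that backup.
  $HEROKU_CMD pgbackups:restore DATABASE "$backup_url" --remote staging
}
```

Calling refresh_staging with HEROKU_CMD=echo is a cheap way to review the commands first. The restore is destructive, so expect the real command to ask for confirmation.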

dancroak

7 minute ab: impatient man’s load tests for a Heroku app

We’ve been working with a client who recently launched a new service. The launch entailed their marketing team sending batches of emails to a 1 million+ person mailing list over 2 days. In the email, there’s a link to the homepage.

The client wanted some confidence that the home page of the Rails app, which is hosted on Heroku, would be able to handle the load generated from that traffic.

They didn’t need a heavy-duty load test, just a little assurance. In turn, I wanted something that was quick to set up and execute.

To repeat, I don’t consider this a rigorous load test. For that, look at something like Blitz.io or Tsung. This is a quick-and-dirty alternative.

Apache Bench

It doesn’t get quicker than Apache Bench:

man ab

The ab command I ended up with:

ab -n 50000 -c 50 -A user:password https://staging.ourapp.com/

50000 requests with 50 concurrent users. Basic auth is used on staging to keep the outside world from seeing the app before it’s unveiled. The trailing / is necessary.

I maxed out at 50 concurrent users because I read in Deploying Rails Applications by Ezra Zygmuntowicz that’s about the most that apache bench can reasonably simulate.

If I had been testing a particular workflow, I might have used the -C flag with a session cookie value grabbed from a browser. That way, every request would use the same session. For this scenario, however, I wanted to generate a new session on each request because I was testing many new users hitting the home page.

Heroku

To get more visibility into what was happening, I added a logging add-on:

heroku addons:upgrade logging:expanded --remote staging

While the tasks ran, I had a shell open tailing the log:

heroku logs -t --remote staging

It was mildly entertaining to watch the foreman-style logs fly by:

2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.9 queue=0 wait=0ms service=49ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 app[web.6]: 2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.6 queue=0 wait=0ms service=156ms status=200 bytes=11323
2011-07-12T16:43:37+00:00 app[web.6]: 2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.2 queue=0 wait=0ms service=51ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 app[web.6]: Started GET "/" for 75.150.96.93 at 2011-07-12 09:43:37 -0700
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.15 queue=0 wait=0ms service=29ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.16 queue=0 wait=0ms service=58ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 heroku[router]: GET dev.testkitchenschool.com/ dyno=web.3 queue=0 wait=0ms service=159ms status=200 bytes=11323
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.7 queue=0 wait=0ms service=162ms status=200 bytes=11323
2011-07-12T16:43:37+00:00 app[web.7]: Started GET "/" for 75.150.96.93 at 2011-07-12 09:43:37 -0700
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.10 queue=0 wait=0ms service=73ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 app[web.10]: Started GET "/" for 75.150.96.93 at 2011-07-12 09:43:37 -0700
2011-07-12T16:43:37+00:00 heroku[router]: GET staging.ourapp.com/ dyno=web.12 queue=0 wait=0ms service=179ms status=200 bytes=11322
2011-07-12T16:43:37+00:00 app[web.3]: Started GET "/" for 75.150.96.93 at 2011-07-12 09:43:37 -0700
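Rather than only watching the router lines fly by, the service= field can be averaged for a rough number. A minimal sketch, assuming the router log format shown above; avg_service_ms is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: avg_service_ms is a hypothetical helper, not a Heroku command.
# It reads log lines on stdin, picks out every "service=NNms" field,
# and prints the request count and the mean service time.
avg_service_ms() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^service=[0-9]+ms$/) {
        v = $i
        sub(/^service=/, "", v)   # drop the field name
        sub(/ms$/, "", v)         # drop the unit
        total += v
        count++
      }
    }
  }
  END {
    if (count > 0)
      printf "%d requests, %.1f ms average\n", count, total / count
  }'
}

# Possible usage, assuming a "staging" git remote exists:
#   heroku logs -n 1000 --remote staging | avg_service_ms
```

This only measures time spent in the dyno as reported by the router, so it is closer to New Relic's "app server" view than to what a visitor experiences.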

New Relic

We use New Relic in production so I figured we should use it for these tests:

heroku addons:add newrelic:standard --remote staging

I started small: 5000 requests, 5 concurrent users, 2 dynos. Then, I added concurrent users until I could see the “request queuing” portion of the New Relic add-on:

App server time

The left-hand mountains represent when I got up to 4 dynos and was hitting the app with unlikely amounts of traffic. The green portion is the “request queuing” time.

The right-hand hills represent when I cranked the dynos up to 12 and was hitting the app with best-case scenario traffic (100% click-through rate on the emails) from three laptops. No request queuing time and pretty nice numbers:

  • 5,250 requests per minute
  • 50ms average response time

Those numbers and the chart above come from what New Relic calls the “app server” stats. The “end user” stats look a little different:

End user time

You can see that even though we’re using the Rails asset pipeline for asset packaging, there’s still an opportunity to improve DOM processing and page rendering.

Ideally, we’d be under 2 seconds end user time.

Conclusion

This was enough information, in combination with their historical email click-through rates, to give the team confidence. In total, this took less than half an hour, and most of that time was spent working on other things while the tests ran.
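The back-of-the-envelope arithmetic behind that confidence can be sketched as follows. It assumes clicks are spread evenly over the send window and that each click is one home page request, which real traffic will not respect since emails go out in batches; expected_rpm is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: expected_rpm is a hypothetical helper.
# expected_rpm RECIPIENTS CLICK_THROUGH_PERCENT WINDOW_HOURS
# Prints the average requests per minute, assuming evenly spread clicks.
expected_rpm() {
  awk -v n="$1" -v ctr="$2" -v hours="$3" \
    'BEGIN { printf "%.0f\n", n * ctr / 100 / (hours * 60) }'
}

# 1 million recipients, 100% click-through, 2 days:
expected_rpm 1000000 100 48   # prints 347
```

Even that unrealistic 100% case averages far below the 5,250 requests per minute measured with 12 dynos, which leaves a lot of headroom for bursts right after each batch goes out.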

Post Script: Right action, right time

I didn’t add caching (page, action, fragment, or otherwise) at all. Split testing code already kept the homepage from being trivial to cache so if it wasn’t necessary, I wanted to avoid it. The data said it wasn’t necessary.
