For the last couple of months, we’ve seen considerable growth in Hoptoad accounts and traffic. Thank you all! That growth brought new traffic patterns and new challenges. We’ve mostly kept up with it while making sure we provide as reliable a service as possible, but there have been some bumps along the way. This is what has happened, what we’ve done about it, and what is yet to come.

For over a year, Hoptoad stored exception details as gzipped XML on Amazon S3. When an error was POSTed to our API endpoint, we validated it, grouped it with similar errors, and stored it on the app server’s file system. Every five minutes, a cron job uploaded all these XML files to S3. The details only became available for viewing in the UI after they made it to S3. This is why, more often than we would have liked, you would see the dreaded message “Details for this error are still being processed”. This served us well for some time, but we knew it was time to rethink this architecture.
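The spooling step described above can be sketched in a few lines of Ruby. This is only an illustration, not Hoptoad's actual code: the helper names (`spool_notice`, `pending_notices`, `read_notice`) and the file naming scheme are assumptions, and the S3 upload itself is left out.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical helper: write one notice's XML to local disk as a gzipped
# file, where a periodic job could later pick it up for upload.
def spool_notice(dir, notice_id, xml)
  path = File.join(dir, "#{notice_id}.xml.gz")
  Zlib::GzipWriter.open(path) { |gz| gz.write(xml) }
  path
end

# Hypothetical helper: list spooled files still waiting for upload,
# oldest first, as a cron-driven uploader might do.
def pending_notices(dir)
  Dir.glob(File.join(dir, '*.xml.gz')).sort_by { |path| File.mtime(path) }
end

# Hypothetical helper: read a spooled notice back as plain XML.
def read_notice(path)
  Zlib::GzipReader.open(path) { |gz| gz.read }
end
```

Anything still sitting in that directory between cron runs is exactly the data that cannot be shown in the UI yet, and that disappears if the instance does.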
There were many problems with this approach. The most obvious was that this “still processing” message was becoming more and more common, and it degraded the experience of viewing error messages for our users (us included). The first thing we did to improve that experience was rather simple and did not require wholesale architectural changes: instead of trying to display the last notice that we got for that error group, we showed you the last processed notice for that group. Instead of seeing the processing message, you would see actionable data for that exception, so that you could get back to work fixing bugs.
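As a rough sketch of that change, assuming each notice carries a timestamp and a flag saying whether its details are viewable yet (the `Notice` struct and its field names here are hypothetical, not Hoptoad's schema):

```ruby
# Hypothetical stand-in for a notice record.
Notice = Struct.new(:id, :created_at, :processed)

# Return the newest notice whose details are already viewable,
# or nil when nothing in the group has been processed yet.
def last_processed_notice(notices)
  notices.select(&:processed).max_by(&:created_at)
end

notices = [
  Notice.new(1, Time.at(100), true),
  Notice.new(2, Time.at(200), true),
  Notice.new(3, Time.at(300), false) # newest, but still processing
]

last_processed_notice(notices).id # => 2
```

The newest notice is skipped only while it is unprocessed, so the view catches up on its own once the details arrive.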
Even though this helped the situation and the number of support requests greatly decreased, we always knew this was a temporary solution and that we could do better. We needed a way to store error details within the life cycle of the request, so that they were available for viewing immediately afterwards. Uploading to S3 had become too slow for our needs.
Furthermore, this was not the only problem with this architecture. The larger problem was that, because of our high traffic, we started running into all sorts of issues: disk space filling up before our workers were able to push notice details to S3 or, even worse, an application instance failing completely and taking any unprocessed details with it. In those rare cases, another application instance would be automatically provisioned, and the XML on the failed instance’s file system would be lost.
In order to display exception details quickly, we decided to make use of MongoDB, removing temporary file system and S3 storage altogether. When an exception hits our API, we do the same processing we’ve always done but store it in a MongoDB collection instead. The three main advantages to you are:
We can’t stop here. We have encountered numerous problems with our current environment, and we are working to improve our infrastructure. This has been our primary focus for the last couple of months.
We plan on migrating our application to a more traditional hosting environment. While we will continue to use virtualization for application servers and other utilities, our databases will now run on bare metal. We are confident that this will increase our overall performance even more, and provide a predictable path for growth. Among other things, this solves:
We have been forced to focus our efforts on performance improvements and architectural changes that can support the growth we’ve seen. We are very sorry for the bumps in the road along the way. We are also tired of feeling apologetic. Enough is enough. We have made changes to improve your experience as a customer, and we will continue to do so. Please bear with us until we’ve migrated our infrastructure. We’ll keep you updated on the timeline for the hosting move. We look forward to being able to stop worrying about performance and start worrying about how to improve the service with better features that make more use of the data and help you handle your app’s bugs efficiently.
Amazon SES came out last week and… you know… shiny.
Right now, price. At our current email rates, we would save more than $10,000 in 2011 using Amazon SES over SendGrid for Airbrake.
However, SendGrid is a reliable service with more features (analytics, spam reports, etc.), so even with those savings, we’re not leaving SendGrid yet on Airbrake.
In the meantime, we’re trying Amazon SES on another project that is in private beta to see how well it performs in terms of deliverability, blacklisting, etc.
A week after Amazon announced the service, there were plenty of libraries on GitHub for Amazon SES. I chose drewblas/aws-ses (the aws-ses gem).
How to use SendGrid in a Rails app:
MyRailsApp::Application.configure do
  config.action_mailer.smtp_settings = {
    address: ENV['SMTP_ADDRESS'],
    authentication: :login,
    domain: 'staging.myrailsapp.com',
    password: ENV['SMTP_PASSWORD'],
    port: 25,
    user_name: ENV['SMTP_USERNAME']
  }
end
You don’t need a special gem: it’s just SMTP.
Amazon SES requires HMAC signing of each request, among other details, but with a library it’s still pretty easy, and the gem has the same dependencies as Rails.
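To give a feel for the HMAC step the library handles for you, here is a minimal sketch using only Ruby's standard library. It assumes the signing scheme SES used at launch, as I understand it: an HMAC-SHA256 over the request's Date header, Base64-encoded and sent in an authorization header. The method names and the exact header layout are illustrative assumptions, not the gem's internals, so check the SES documentation before relying on them.

```ruby
require 'openssl'
require 'time'

# Sign the Date header value with HMAC-SHA256 and Base64-encode the result.
def ses_signature(secret_key, date_header)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha256'), secret_key, date_header)
  [digest].pack('m0') # strict Base64, no trailing newline
end

# Assumed header layout; treat it as a sketch rather than a reference.
def ses_authorization_header(access_key_id, secret_key, date_header)
  signature = ses_signature(secret_key, date_header)
  "AWS3-HTTPS AWSAccessKeyId=#{access_key_id}, Algorithm=HmacSHA256, Signature=#{signature}"
end

date = Time.now.httpdate
header = ses_authorization_header('EXAMPLE_ACCESS_KEY', 'example-secret', date)
```

The secret key never travels with the request; only the signature does, which is why the gem needs both keys in its configuration.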
Add the gem to your Gemfile:
gem 'aws-ses', '~> 0.4.4', require: 'aws/ses'
Extend ActionMailer in config/initializers/amazon_ses.rb:
ActionMailer::Base.add_delivery_method :ses, AWS::SES::Base,
  access_key_id: ENV['AMAZON_ACCESS_KEY'],
  secret_access_key: ENV['AMAZON_SECRET_KEY']
Set the delivery method in config/environments/{staging,production}.rb:
config.action_mailer.delivery_method = :ses
That’ll do it. Happy emailing!
Written by Dan Croak.