giant robots smashing into other giant robots

Written by thoughtbot

hrward

Redis partial word match — you (auto)complete me

We can use partial word matching to rapidly search strings of text such as names, cities, and states.

We can do this by indexing strings into Redis sets based on partial matches of the string. The indexing process takes a string and breaks it into left-bound substrings which are placed into the appropriate Redis sets.

Partial word matching movie titles

In this example we’ll enable partial word matching queries of movie titles.

The movie titles are stored as plain-text strings in simple key-value pairs. Each key follows the structure movies:id, where id is an incrementing integer:

$ redis-cli get movies:1
"Bad Santa"
$ redis-cli get movies:2
"Batman"
$ redis-cli get movies:3
"Bad Company"

The movie Batman would be decomposed into the following sets (assuming that the indexing started at two characters):

ba
bat
batm
batma
batman
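
The decomposition above can be sketched in a few lines of Ruby. The prefixes helper and its min_length keyword are hypothetical names chosen for illustration; they are not part of the original post.

```ruby
# Hypothetical helper: returns every left-bound substring of a word,
# starting at min_length characters.
def prefixes(word, min_length: 2)
  normalized = word.downcase
  (min_length..normalized.length).map { |length| normalized[0, length] }
end

p prefixes('Batman')
# => ["ba", "bat", "batm", "batma", "batman"]
```

A word shorter than min_length produces an empty list, so it would not be indexed at all.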

We’ll index the movie titles into sets based on partial matches of their titles. To prevent key collisions in Redis, we’ll create keys with the structure index:abc:movies, where abc is the partial match.

View the indexed ids using the Redis smembers command:

$ redis-cli smembers index:ba:movies
1) "1"
2) "2"
3) "3"

$ redis-cli smembers index:bat:movies
1) "2"

Note: The keys for the movie titles have been parameterized (lowercase and spaces replaced with dashes).
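
A minimal sketch of the indexing step, assuming an in-memory Hash of Sets as a stand-in for Redis so it runs without a server. The parameterize and build_index helpers are hypothetical; against a real server each insertion would be an SADD on the same key.

```ruby
require 'set'

# Hypothetical stand-in for ActiveSupport's String#parameterize:
# lowercase, with runs of non-alphanumeric characters replaced by dashes.
def parameterize(title)
  title.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-|-\z/, '')
end

# In-memory stand-in for Redis sets; with a real server each `<<`
# below would be an SADD on the same key.
def build_index(movies, min_length: 2)
  index = Hash.new { |hash, key| hash[key] = Set.new }
  movies.each do |id, title|
    slug = parameterize(title)
    (min_length..slug.length).each do |length|
      index["index:#{slug[0, length]}:movies"] << id
    end
  end
  index
end

movies = { 1 => 'Bad Santa', 2 => 'Batman', 3 => 'Bad Company' }
index = build_index(movies)

p index['index:ba:movies'].to_a   # => [1, 2, 3]
p index['index:bat:movies'].to_a  # => [2]
```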

We’ll use a Ruby class Autocomplete to wrap the Redis calls:

# autocomplete.rb
require 'redis'

class Autocomplete
  def initialize(partial_word)
    @partial_word = partial_word
  end

  def movies
    redis.sort "index:#{partial_word}:movies", by: :nosort, get: 'movies:*'
  end

  private

  def redis
    # Redis.current is no longer available in redis-rb 5; memoize a client instead.
    @redis ||= Redis.new
  end

  attr_reader :partial_word
end

movies = Autocomplete.new('bat').movies

We’ll use the sort command with the by: :nosort parameter to query the partial word match index and return the matched movie names.
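
To make the lookup less magical, here is a rough in-memory emulation of what SORT with BY nosort and a GET pattern does: it skips sorting, then replaces the first * in the pattern with each set member and fetches the resulting key. The sort_nosort_get helper and the plain hashes are illustrative stand-ins, not Redis itself. With a real set, the order of members is not guaranteed.

```ruby
# Rough emulation of `SORT key BY nosort GET pattern` over plain hashes:
# no sorting, then substitute each member into the pattern and look it up.
def sort_nosort_get(sets, strings, key, pattern)
  sets.fetch(key, []).map { |member| strings[pattern.sub('*', member)] }
end

# In-memory copies of the data shown earlier.
strings = {
  'movies:1' => 'Bad Santa',
  'movies:2' => 'Batman',
  'movies:3' => 'Bad Company'
}
sets = {
  'index:ba:movies'  => %w[1 2 3],
  'index:bat:movies' => %w[2]
}

p sort_nosort_get(sets, strings, 'index:bat:movies', 'movies:*')
# => ["Batman"]
p sort_nosort_get(sets, strings, 'index:ba:movies', 'movies:*')
# => ["Bad Santa", "Batman", "Bad Company"]
```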

Written by Harlow Ward

Related Reading

Redis Set Intersection - Using sets to filter data

Redis Pub/Sub…how does it work?

hrward

Redis Set Intersection - Using sets to filter data

Redis is a fantastic lightweight key-value store. It also acts as a data structure server for storing data types like lists, sets, and hashes.

With Redis sets we can store records in groups based on time. The sunion command then lets us combine multiple sets to cover a particular time frame.

Example application

Search flights based on a departure city, arrival city, and the date/time of departure.

Redis set intersection

| Id | Departure Time     | Airline           | Departure City | Arrival City |
| 1  | 03/23/2013 1:30 pm | Virgin America    | Los Angeles    | New York     |
| 2  | 03/23/2013 1:45 pm | Virgin America    | San Francisco  | New York     |
| 3  | 03/23/2013 1:45 pm | American Airlines | San Francisco  | New York     |
| 4  | 03/23/2013 1:50 pm | American Airlines | Los Angeles    | Boston       |
| 5  | 03/23/2013 2:45 pm | Southwest         | San Francisco  | New York     |
| 6  | 03/23/2013 3:30 pm | Southwest         | San Francisco  | New York     |

The flight data is stored as strings in Redis using key-value pairs with the following key structure:

get flights:1
"1, Virgin America, 03/23/2013 1:30 pm, Los Angeles, New York"
get flights:2
"2, Virgin America, 03/23/2013 1:45 pm, San Francisco, New York"
...
get flights:6
"6, Southwest, 03/23/2013 3:30 pm, San Francisco, New York"

Group flights by departure time

To group flights by departure time we’ll create a Redis set for each hour of the day, and add each flight to the appropriate set based on its epoch departure time.

smembers departure_time:1364068800:flights
1) "1"
2) "2"
3) "3"
4) "4"
smembers departure_time:1364072400:flights
1) "5"
smembers departure_time:1364076000:flights
1) "6"
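
As a sketch of how a flight could be assigned to its hourly set, the epoch time can be rounded down to the start of the hour. The departure_time_key helper and its bucket_seconds keyword are hypothetical names; the post does not show its indexing code.

```ruby
# Hypothetical helper: round an epoch timestamp down to the start of its hour
# and build the matching set key.
def departure_time_key(epoch_time, bucket_seconds: 3600)
  bucket = epoch_time - (epoch_time % bucket_seconds)
  "departure_time:#{bucket}:flights"
end

# 1364070600 is 30 minutes past 1364068800, the first bucket shown above.
puts departure_time_key(1_364_070_600)
# => departure_time:1364068800:flights
```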

Group flights by city

Flights will also be grouped into sets based on their departure and arrival cities. The city names have been parameterized to keep a consistent key structure.

Departing flights:

smembers cities:los-angeles:departures
1) "1"
2) "4"
smembers cities:san-francisco:departures
1) "2"
2) "3"
3) "5"
4) "6"

Arriving flights:

smembers cities:boston:arrivals
1) "4"
smembers cities:new-york:arrivals
1) "1"
2) "2"
3) "3"
4) "5"
5) "6"

Find flights from San Francisco to New York

To find all flights departing from San Francisco and arriving in New York we can use sinter to get the intersection of two sets.

sinter cities:san-francisco:departures cities:new-york:arrivals
1) "2"
2) "3"
3) "5"
4) "6"

However, this technique does not allow us to filter flights over a range in time.

To find flights departing on 03/23/2013 between 1:00 pm and 3:00 pm, we’ll need to group the sets of departure times, and then perform an intersection with the resulting set.

Here are the steps we’ll take:

  1. Create a new temp_set using sunionstore to group flights by departure time.
  2. Intersect the temporary set with the departure and arrival sets.
  3. Loop over the results of the intersection and generate an array of flight keys.
  4. Use mget to fetch all the flight data.
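
Before involving Redis, the set logic in steps 1 to 3 can be checked with Ruby's own Set class. This is only an in-memory illustration using the set contents listed above; the variable names are made up for the example.

```ruby
require 'set'

# Set contents copied from the smembers output above.
departure_times = {
  1_364_068_800 => Set['1', '2', '3', '4'],
  1_364_072_400 => Set['5'],
  1_364_076_000 => Set['6']
}
san_francisco_departures = Set['2', '3', '5', '6']
new_york_arrivals = Set['1', '2', '3', '5', '6']

# Step 1: union the hourly buckets inside the window (what sunionstore does).
window = [1_364_068_800, 1_364_072_400]
temp_set = window.map { |epoch| departure_times.fetch(epoch) }.reduce(Set.new, :|)

# Step 2: intersect with the city sets (what sinter does).
flight_ids = temp_set & san_francisco_departures & new_york_arrivals

# Step 3: turn the ids into flight keys, ready for mget.
flight_keys = flight_ids.sort.map { |id| "flights:#{id}" }

p flight_keys
# => ["flights:2", "flights:3", "flights:5"]
```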

We’ll use a combination of Ruby and Redis to achieve this:

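# flight_finder.rb
require 'redis'
require 'active_support/core_ext/string/inflections' # provides String#parameterize

class FlightFinder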
  def initialize(attrs)
    @arrival_city = attrs.fetch(:arrival_city).parameterize
    @departure_city = attrs.fetch(:departure_city).parameterize
    @window = attrs.fetch(:window)
  end

  def flights
    redis.mget(*flight_keys)
  end

  private

  attr_reader :arrival_city, :departure_city, :window

  def flight_keys
    flight_ids.map { |id| "flights:#{id}" }
  end

  def flight_ids
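    # Queue both commands on the object yielded by multi;
    # the last reply is the result of sinter.
    redis.multi do |transaction|
      transaction.sunionstore('temp_set', *departure_time_keys)
      transaction.sinter('temp_set', departure_cities_key, arrival_cities_key)
    end.last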
  end

  def departure_cities_key
    "cities:#{departure_city}:departures"
  end

  def arrival_cities_key
    "cities:#{arrival_city}:arrivals"
  end

  def departure_time_keys
    window.range_keys { |epoch_time| "departure_time:#{epoch_time}:flights" }
  end

  def redis
    # Redis.current is no longer available in redis-rb 5; memoize a client instead.
    @redis ||= Redis.new
  end
end

We’ll introduce the concept of a window to avoid a long parameter list, and group parameters that naturally fit together. This allows us to encapsulate behavior between the start_at and end_at attributes.

# window.rb
require 'date'

class Window
  ONE_HOUR_IN_SECONDS = 3600

  # Keyword arguments match the call in flights.rb.
  def initialize(start_at:, end_at:)
    @start_at = convert_to_epoch(start_at)
    @end_at = convert_to_epoch(end_at)
  end

  def range_keys
    (start_at...end_at).step(hour).map do |epoch_time|
      yield(epoch_time)
    end
  end

  private

  attr_reader :start_at, :end_at

  def convert_to_epoch(string_or_time)
    # DateTime.parse treats strings without an explicit offset as UTC.
    DateTime.parse(string_or_time).to_time.to_i
  end

  def hour
    ONE_HOUR_IN_SECONDS
  end
end

Create a command line script.

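# flights.rb
# require_relative resolves against this file, since the current
# directory is not on Ruby's load path.
require_relative 'flight_finder'
require_relative 'window'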

window = Window.new(
  start_at: 'March 23, 2013 1:00 pm',
  end_at: 'March 23, 2013 3:00 pm'
)

flight_finder = FlightFinder.new(
  departure_city: 'San Francisco',
  arrival_city: 'New York',
  window: window
)

puts flight_finder.flights

Run the command line script.

$ ruby flights.rb
2, Virgin America, 03/23/2013 1:45 pm, San Francisco, New York
3, American Airlines, 03/23/2013 1:45 pm, San Francisco, New York
5, Southwest, 03/23/2013 2:45 pm, San Francisco, New York

Takeaways

  • Redis set intersection is a powerful filtering technique.
  • Use sunionstore to create new sets on the fly.
  • Redis data structures are fantastic for slicing and dicing data.

Written by Harlow Ward.

Next Steps & Related Reading

Redis partial word match — you (auto)complete me

Redis Pub/Sub…how does it work?

dancroak

Recipe: A/B testing with KISSMetrics and the split gem

A/B testing can turn skeptics into believers. Jamie, a designer at 37signals, recently shared a great case study of A/B testing Highrise.

So, what are the mechanics of actually setting up an A/B test in a Rails app?

One approach we’re trying right now on Trajectory is using the split gem with KISSMetrics. For the first few months of Trajectory’s life, we felt it was necessary to introduce Trajectory and explain why we made it in the face of existing tools.

Now, it’s time to explain its benefits on its own merits and see how well that resonates with potential users versus the old copy. This recipe will show that example. We’ll also be testing more dramatic layouts, which this combination can also handle.

Setup

Gemfile:

gem "split"

config/initializers/split.rb:

Split.redis = ENV['REDISTOGO_URL'] || 'redis://localhost:6379'
Split.redis.namespace = "split:trajectory"

Red

Let’s use the Cucumber directory convention.

features/visitor/views_homepage.feature:

Scenario: Original landing page
  When I go to the home page with the "original" alternative for the "landing_page" experiment
  Then I should see "Over the past 8 years, we've used many tools"
  And KISSmetrics receives the following properties:
    | property     | value    |
    | landing_page | original |

Scenario: New copy on landing page
  When I go to the home page with the "new_copy" alternative for the "landing_page" experiment
  Then I should see "One gorgeous tool that everyone actually LIKES to use"
  And KISSmetrics receives the following properties:
    | property     | value    |
    | landing_page | new_copy |

The split gem’s documentation provides a way to override the alternatives.

features/support/paths.rb:

when /^the home page with the "([^"]+)" alternative for the "([^"]+)" experiment/
  "/?#{$2}=#{$1}"

We can figure out the expected JavaScript from the KISSMetrics API documentation.

features/step_definitions/kissmetrics_steps.rb:

Then /^KISSmetrics receives the following properties:$/ do |table|
  table.hashes.each do |hash|
    property = hash['property']
    value = hash['value']
    expected_javascript = %Q{_kmq.push(["set","#{property}","#{value}"]);}

    page.should have_content(expected_javascript)
  end
end

Green

In this example, we set up two partials that simply contain a translation.

The translation uses the Rails i18n API and is backed by Copycopter in order to make changes to it without re-deploying.

app/views/homes/_new_copy.html.erb:

<%= t(".letter-new-copy", :default => %{
  <p><strong>Less frustration, more joy: there's a better way to build software.</strong></p>
  <p>You know that your software planning tools aren't perfect.  It's not clear what to do next.  Things get lost in the shuffle when you copy stuff around between different tools - tools for having product discussions, reviewing wireframes and usability test results, building to-do lists for developers, and keeping track of bugs.</p>
  <p>We had the same problems.  We’re thoughtbot, a web design and development agency, and that's why we made Trajectory.</p>
  <p>Imagine if you could build better software, faster.  Imagine your teammates not waiting on each other, having a clear sense of what to do next.  Imagine if everything you needed to plan and build was in one place,  with no friction or overhead in the process.  One gorgeous tool that everyone actually LIKES to use - managers, clients, designers, and developers alike.</p>
  <p>It's super easy to try out for free.  Hit the ground running by importing your Pivotal Tracker project or invite your current team members.</p>
}) %>

app/views/homes/_original.html.erb:

<p><strong>Hi, we&rsquo;re thoughtbot, a web design and development agency.</strong></p>
<%= t(".letter", default: %{
  <p>Over the past 8 years, we've used many tools for project communication and planning.</p>
  <p>Basecamp was great for discussion and communication. Pivotal Tracker was great for user stories and emergent planning.</p>
  <p>We've grown tired of having one tool that designers love, one tool that developers love, and no tool that clients love.</p>
  <p>We created Trajectory to solve our own problems. We now use it on all of our projects. Maybe it can solve your problems too.</p>
}) %>

Using split is pretty simple.

app/views/homes/show.html.erb:

<%= render partial: ab_test("landing_page", "original", "new_copy") %>

split provides a web interface but we have all our funnel metrics in KISSMetrics so we want to send the data there.

app/views/shared/_javascript.html.erb:

<script type="text/javascript">
  <% Split::Experiment.all.each do |experiment| %>
    _kmq.push(['set', { '<%= experiment.name %>': '<%= ab_test(experiment.name, *experiment.alternative_names) %>' }]);
  <% end %>
</script>

Putting A/B testing in context

This is a recipe for “how” to A/B test but if you’re interested in “when” to test, see our A/B testing page in our playbook.

qrush

Redis Pub/Sub…how does it work?

Redis is a key/value store, but it’s jam-packed with a ton of other little utilities that make it a joy to explore and implement. Two of these are the PUBLISH and SUBSCRIBE commands, which enable you to do quick messaging and communication between processes. Granted, there are plenty of other messaging systems out there (AMQP and ØMQ come to mind), but Redis is worth a look too.

The way it works is simple:

  • SUBSCRIBE will listen to a channel
  • PUBLISH allows you to push a message into a channel

Those two commands are all you need to build a messaging system with Redis. But, what should you build?

Demo

I tend to build quick little games to learn new ideas, frameworks, languages, etc. For Redis pub/sub I chose to emulate IRC, since “channels” are essentially the same concept for an IRC server. A user connects, talks into a channel, and if others are there, they get the message. This is the basic concept; we’re not going to re-implement the IRC protocol here.

I scraped together two tiny little Ruby scripts for this. The repo is on GitHub, if you want to play with it. Make sure you’re running Redis locally first! Install Redis via redis.io if not.

PUBLISH

First up, pub.rb publishes messages to a channel. We’re going to bring in the Redis gem to use for making a client, and JSON in order to have an easy transport format. We could have used Ruby’s Marshal class instead to serialize, but this works fine and is human readable.

# usage:
# ruby pub.rb channel username

require 'rubygems'
require 'redis'
require 'json'

$redis = Redis.new

data = {"user" => ARGV[1]}

# Stop cleanly at end of input (Ctrl-D), when gets returns nil.
while (msg = STDIN.gets)
  $redis.publish ARGV[0], data.merge('msg' => msg.strip).to_json
end

This script will run interactively, once provided a channel and username from the command line arguments. It then fires off the PUBLISH command every time the user types a message and hits Enter. It publishes the message to a channel (in ARGV[0], the first command line argument).

So if we were to run:

% ruby pub.rb rubyonrails qrush
Hello world

Our Redis client would then send a PUBLISH command down the “rubyonrails” channel with the given message. The message itself is JSON and looks like:

{
  "msg": "Hello world",
  "user": "qrush"
}

We can actually verify this with the MONITOR command, which will spit out all commands the Redis server has processed. If we had MONITOR running before sending the above hello world snippet, it shows:

% redis-cli
redis> MONITOR
OK
1306462616.036890 "MONITOR"
1306462620.482914 "publish" "rubyonrails" "{\"user\":\"qrush\",\"msg\":\"Hello world\"}"

Currently this simple script doesn’t support publishing under more than one channel. Opening up more than one pub.rb process will let you do that.
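
The JSON transport used by pub.rb can be exercised without a Redis server. The sketch below builds the payload the same way and parses it back as a subscriber would; encode_message is a hypothetical helper name, not part of the original scripts.

```ruby
require 'json'

# Build the payload the same way pub.rb does: user plus the stripped line.
def encode_message(user, line)
  { 'user' => user }.merge('msg' => line.strip).to_json
end

payload = encode_message('qrush', "Hello world\n")
puts payload
# => {"user":"qrush","msg":"Hello world"}

# Parse it back the way a subscriber would.
data = JSON.parse(payload)
puts "[#{data['user']}]: #{data['msg']}"
# => [qrush]: Hello world
```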

SUBSCRIBE

Woot! Now that we have messages being sent we have to listen to them. Enter sub.rb:

require 'rubygems'
require 'redis'
require 'json'

$redis = Redis.new(:timeout => 0)

$redis.subscribe('rubyonrails', 'ruby-lang') do |on|
  on.message do |channel, msg|
    data = JSON.parse(msg)
    puts "##{channel} - [#{data['user']}]: #{data['msg']}"
  end
end

Once again we’ll need Redis and JSON to connect and parse messages. The initialization process for Redis is different this time: it’s using a new :timeout option. This will force the Redis client to never time out while waiting for a response, so we’ll wait forever for messages to come in. Perfect!

This script subscribes to two different channels: rubyonrails and ruby-lang. Basically, once the interpreter reaches the subscribe block, it never exits and keeps waiting for messages.

When a message comes in, the message block is fired, yielding two arguments: the channel the message was on, and the actual data sent down the pipe. Parsing that JSON chunk then allows us to spit out who said it, where they said it, and what was actually said. Here’s what it looks like if some other clients are publishing messages after we run our sub.rb file. (The published messages arrive in the order they are sent, but that’s hard to display in text):

% ruby pub.rb rubyonrails qrush
Whoa!
`rake routes` right?

% ruby pub.rb rubyonrails turbage
How do I list routes?
Oh, duh. thanks bro.

% ruby pub.rb ruby-lang qrush  
I think it's Array#include? you really want.

% ruby sub.rb 
#rubyonrails - [qrush]: Whoa!
#rubyonrails - [turbage]: How do I list routes?
#ruby-lang - [qrush]: I think it's Array#include? you really want.
#rubyonrails - [qrush]: `rake routes` right?
#rubyonrails - [turbage]: Oh, duh. thanks bro.

Whoa! How does it work!? Under the hood in the redis-rb client, the subscribe block is actually stuck in a Ruby loop (called from Redis::SubscribedClient#subscription). The client is going to continually attempt to read from the socket for messages, until there’s an error of some kind (but not a Timeout!). The redis-server then keeps a list of channels and patterns for each connected client, and publishes messages to them when a PUBLISH command is sent.
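
The bookkeeping described above (a list of channels and patterns per subscriber, with messages fanned out on publish) can be mimicked by a toy in-memory broker. This is purely illustrative and is not how redis-server is implemented; ToyBroker and its methods are hypothetical names, and File.fnmatch is used as an approximation of glob-style pattern matching.

```ruby
# Toy in-memory broker, for illustration only. It tracks handlers per channel
# and per glob-style pattern, and fans messages out on publish.
class ToyBroker
  def initialize
    @channels = Hash.new { |hash, key| hash[key] = [] }
    @patterns = Hash.new { |hash, key| hash[key] = [] }
  end

  def subscribe(channel, &handler)
    @channels[channel] << handler
  end

  def psubscribe(pattern, &handler)
    @patterns[pattern] << handler
  end

  # Delivers the message and returns how many handlers received it.
  def publish(channel, message)
    receivers = @channels.fetch(channel, [])
    @patterns.each do |pattern, handlers|
      receivers += handlers if File.fnmatch(pattern, channel)
    end
    receivers.each { |handler| handler.call(channel, message) }
    receivers.size
  end
end

broker = ToyBroker.new
broker.subscribe('rubyonrails') { |channel, msg| puts "##{channel} - #{msg}" }
broker.psubscribe('ruby*') { |channel, msg| puts "(pattern) ##{channel} - #{msg}" }

broker.publish('rubyonrails', 'Whoa!')          # both handlers fire
broker.publish('ruby-lang', 'Array#include?')   # only the pattern handler fires
```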

Usage

Although this is a simple example of how Redis pub/sub works, it’s pretty cool to see what others have done with it. Some examples include Convore, which uses it as a pretty central part of their infrastructure, and Realie, a real-time code editor like Etherpad. Another good link to check out is Salvatore Sanfilippo’s recent interview on The Changelog (around 24:00-27:00) where he discusses how developers are switching from other MQs to Redis due to its simplicity and performance.

If you’re using Redis’ Pub/Sub within your infrastructure, we’d love to hear your feedback on how Radish can provide more visibility to what your messaging system is up to.

Next Steps & Related Reading

Redis sets - The intersection of space and time

Redis partial word match — you (auto)complete me

qrush

Radish: Dig Deep Into Redis

It’s no secret that we’re big fans of Redis here at thoughtbot. Redis has great data structures, plenty of fantastic libraries for interacting with it, and is plain fast. Several questions kept arising, though, every time we deployed Redis to Hoptoad and several client projects: What’s actually going on inside of Redis? What data is being stored, how much, and how fast?

Our answer to these questions is a new Redis analysis and monitoring service, Radish.

Radish works side-by-side with your existing Redis instances in your hosting environment. All you need to do is start a daemon which handles connecting to Redis and sending data out to Radish over HTTPS.

INFO

So, what can Radish do?

Picking out spikes in Redis operations is helpful to correlate to web traffic, or lack thereof! We also split out the types of commands for reads vs. writes (GET vs SET, for instance). “Other” commands are usually admin/maintenance actions such as INFO.

We also track several stats about your Redis instance, such as average commands per second, memory usage, changes since last save, and keyspace size. Managing a growing keyspace and memory footprint is a constant effort when maintaining a Redis instance, and Radish can help you track both and keep them at a manageable level. Tracking changes since last save has also helped us figure out that we don’t SAVE our Redis data often enough, which could have dire effects if the Redis instance were to go down.

Breakdowns of high-frequency commands, keys, and namespaces help determine what your instance is processing. Learn what keys are being written to and when. Highlight the areas of your code that might be wasting memory on your Redis instance. Because Redis is single-threaded, Radish can also show what might be blocking it from serving requests.
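
To give a feel for this kind of breakdown, here is a small sketch that tallies reads, writes, and other commands from lines in the MONITOR output format. It is not Radish's implementation, which is not described here; the helper names, the command grouping, and all sample lines except the first are made up for illustration.

```ruby
# Hypothetical grouping of commands; a real tool would cover many more.
def command_kind(command)
  case command.downcase
  when 'get', 'mget', 'smembers', 'sinter' then :read
  when 'set', 'sadd', 'sunionstore', 'del' then :write
  else :other
  end
end

# Pull the command name out of one line of MONITOR output. Newer servers add a
# "[db client]" field after the timestamp, so that part is optional.
def monitored_command(line)
  match = line.match(/\A\d+\.\d+ (?:\[[^\]]*\] )?"([^"]+)"/)
  match && match[1]
end

# Count commands per kind, skipping lines that do not look like MONITOR output.
def tally(lines)
  lines.each_with_object(Hash.new(0)) do |line, counts|
    command = monitored_command(line)
    counts[command_kind(command)] += 1 if command
  end
end

lines = [
  '1306462616.036890 "MONITOR"',
  '1306462620.482914 "publish" "rubyonrails" "hello"',
  '1306462621.000000 "get" "movies:1"',
  '1306462622.000000 "sadd" "index:ba:movies" "1"'
]

p tally(lines)
```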

MONITOR

Radish is already helping us understand what our Redis instances are up to without a lot of custom scripts and temporary hacks. Sign up, get it running with your Redis instances, and let us know how it goes!

Huge thanks to my fellow robots for helping me get this idea off the ground. Our beta testers also deserve kudos for hammering us with live traffic and exposing lots of fun bugs.