The Perils of Uniqueness Validations

Your Rails application probably makes use of uniqueness validations in several key places. This validation provides a nice user experience when duplicate records are detected but, as we will see in a moment, it is not enough to ensure data integrity.

What Can Go Wrong?

Let’s take a look at our sample User class.

class User < ActiveRecord::Base
  validates :email, presence: true, uniqueness: true
end

When you persist a user instance, Rails will validate your model by running a SELECT query to see if any user records already exist with the provided email. Assuming the record proves to be valid, Rails will run the INSERT statement to persist the user. Nothing ties those two statements together, so another process can insert the same email between the SELECT and the INSERT. This works great in development and may even work in production if you’re running a single instance of a single-process web server.

But you’re not running a lone instance of WEBrick, are you? No, to maximize requests per minute, you’re running Unicorn on multiple Heroku dynos, each with multiple web processes. Let’s take a look at what happens if just two of these processes are trying to create a user with the same email address at around the same time:

[Figure: validate uniqueness without index]

Uh oh. Now we’ve got a problem. We wanted the uniqueness validation to keep data consistent with our intentions, but it has failed. Why? Because we never told the database of our intentions.
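As a rough illustration of that interleaving, here is a minimal plain-Ruby sketch. UsersTable and simulate_race are hypothetical stand-ins for the users table and the two web processes; no Rails or database is involved, and the steps run in a fixed order rather than truly concurrently.

```ruby
# In-memory stand-in for a users table that has no unique index (hypothetical).
class UsersTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Plays the role of the SELECT issued by the uniqueness validation.
  def email_taken?(email)
    @rows.any? { |row| row[:email] == email }
  end

  # Plays the role of the INSERT; nothing here rejects duplicates.
  def insert(email)
    @rows << { email: email }
  end
end

# Replays the problematic ordering and returns how many rows hold the email.
def simulate_race(table, email)
  # Both processes run their validation before either one inserts.
  process_a_valid = !table.email_taken?(email)
  process_b_valid = !table.email_taken?(email)

  # Each process saw no existing row, so each goes on to insert.
  table.insert(email) if process_a_valid
  table.insert(email) if process_b_valid

  table.rows.count { |row| row[:email] == email }
end

puts simulate_race(UsersTable.new, "user@example.com") # prints 2
```

Both checks pass because neither insert has happened yet, which is exactly the window the validation cannot close on its own.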

Unique Indexes to the Rescue

Let’s make sure the database is in on the plan by telling it to create a unique constraint on users.email. We do this with a unique index.

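class AddEmailIndexToUser < ActiveRecord::Migration
  # On Rails 5 and later, inherit from a versioned class instead,
  # such as ActiveRecord::Migration[5.0].
  def change
    # If you already have a non-unique index on email, you will need
    # to remove it before you're able to add the unique index.
    add_index :users, :email, unique: true
  end
end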

Alternatively, you can create the unique index when generating the migration or model with:

rails generate model user email:string:uniq

With the index in place, how does the above scenario play out now?

[Figure: validate uniqueness with index]

Now we have the database acting as our last line of defense in our war on inconsistent data. The second save operation will generate an ActiveRecord::RecordNotUnique exception. In most cases, this will result in an application error. If you need to provide a better experience, you can rescue and handle that exception in the controller action or use rescue_from at the class level.
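One way to handle the exception is to treat it like a failed validation. The sketch below is only an illustration: save_treating_duplicates_as_invalid is a hypothetical helper, the error message text is an assumption, and the stand-in exception class exists solely so the snippet can run outside a Rails application.

```ruby
# Stand-in so the sketch also runs outside Rails. Inside a Rails app the real
# ActiveRecord::RecordNotUnique is already defined and this block is skipped.
unless defined?(ActiveRecord::RecordNotUnique)
  module ActiveRecord
    class RecordNotUnique < StandardError; end
  end
end

# Hypothetical helper: returns true or false like `save`, but turns a
# unique-index violation into an ordinary error on `attribute`.
def save_treating_duplicates_as_invalid(record, attribute)
  record.save
rescue ActiveRecord::RecordNotUnique
  # The database rejected the row, so report it the way the validation would.
  record.errors.add(attribute, "has already been taken")
  false
end
```

A controller action could call this helper in place of save and re-render the form when it returns false, so the losing request sees the same message the uniqueness validation would have shown.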

Where Else Can Unique Indexes Help?

Your Rails application may also have several has_one relationships. Specifying a has_one relationship merely sets the relationship methods up to deal with a single object rather than a collection. has_one on its own does nothing to ensure data integrity.

For example, we’ve decided to add a Profile class to our application. Users will have a single profile record. Our classes now look like this:

class User < ActiveRecord::Base
  has_one :profile
  validates :email, presence: true, uniqueness: true
end

class Profile < ActiveRecord::Base
  belongs_to :user
end

We’d expect that the profiles table would contain no duplicate user_id values, but has_one doesn’t make any promises about that. We’ll have to add a unique index to profiles (on user_id) to prevent data inconsistency, for example with add_index :profiles, :user_id, unique: true in a migration.

How Can I Find Problems in My Application?

You could search your project for validates_uniqueness_of, uniqueness:, and has_one and then cross-reference the results with a list of indexes pulled from your database, or you could let a gem do that for you. Consistency Fail (consistency_fail) is a gem that finds missing unique indexes for you. Simply install and run it as detailed in the README.

One problem I’ve seen with consistency_fail is that the has_one searches do not properly recognize polymorphic relationships. It will suggest a unique index on the association’s id column alone when what you really need is a compound unique index on its type and id columns together.
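If you would rather start with the manual search, it can be sketched in a few lines of plain Ruby. This is a crude text scan, not a replacement for the gem: find_uniqueness_declarations is a hypothetical helper, the app/models default is an assumption about your project layout, and the patterns will also match mentions inside comments. You would still need to compare the output against your schema by hand.

```ruby
# Patterns for the three declarations worth cross-referencing with indexes.
UNIQUENESS_PATTERNS = {
  "validates_uniqueness_of" => /\bvalidates_uniqueness_of\b/,
  "uniqueness:"             => /\buniqueness:/,
  "has_one"                 => /\bhas_one\b/
}.freeze

# Returns [path, line_number, label, source_line] for every match found in
# the Ruby files under `dir` (hypothetical helper, default path assumed).
def find_uniqueness_declarations(dir = "app/models")
  Dir.glob(File.join(dir, "**", "*.rb")).sort.flat_map do |path|
    File.readlines(path).each_with_index.flat_map do |line, index|
      UNIQUENESS_PATTERNS
        .select { |_label, pattern| line.match?(pattern) }
        .map { |label, _pattern| [path, index + 1, label, line.strip] }
    end
  end
end
```

Calling find_uniqueness_declarations from the root of a project and printing each entry gives a checklist of columns to compare against the indexes in your schema.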

In Conclusion

Rails does many things, but guaranteeing data integrity is not one of them. Your relational database is designed to enforce data integrity; let it.

Derek Prior Developer