Better Single-Table Inheritance

ruby rails
Posted on: 2013-05-03

I've seen several articles recently on the evils of single table inheritance in Rails apps. While it can be problematic, I recently tried an approach to STI that I think works well.

Here's what it boils down to: Put common attributes in a single table, non-shared attributes in separate tables with foreign key references, and use object delegation so that each model transparently pulls what it needs from both.

This solution was inspired by my recent reading of a couple of books on object-oriented design: Sandi Metz's Practical Object-Oriented Design in Ruby and Russ Olsen's Design Patterns in Ruby.

For details, read on.

Single Table Inheritance Blues

I've been working on an app where contracts are a foundational model. The app needs to handle several kinds of contracts, which have some attributes in common, but each kind also has unique attributes.

As a simplified example, let's say that all Contracts have start_date and end_date attributes. In addition, ConstructionContracts have a site_address, etc., and WrestlingContracts have a win_ratio, etc.

In plain Ruby, it's normal for subclasses to have their own distinct methods in addition to the parent class's. But that doesn't work well for single table inheritance in Rails. If you store two kinds of contracts in one table, ActiveRecord will give them both accessors for all the columns. You won't validate the presence of a site_address for a WrestlingContract, but it will still have the methods site_address and site_address=.
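Here's a rough plain-Ruby sketch of that accessor leakage. It is not ActiveRecord: the COLUMNS list is a hypothetical stand-in for the columns of a shared contracts table, and attr_accessor plays the role of the accessors Rails would generate.

```ruby
# Toy stand-in for ActiveRecord: every column of the shared table becomes
# an accessor on the base class, so every subclass inherits all of them.
class Contract
  # Hypothetical column list for a single STI `contracts` table.
  COLUMNS = %i[start_date end_date site_address win_ratio].freeze

  COLUMNS.each { |column| attr_accessor column }
end

class ConstructionContract < Contract; end
class WrestlingContract < Contract; end

wrestling = WrestlingContract.new
wrestling.respond_to?(:site_address)   # true, although it makes no sense here
wrestling.respond_to?(:site_address=)  # true as well
```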

This is bad OO. It's also bad database normalization: lots of NULLs in the table. That, in turn, means you can't set non-shared columns to NOT NULL as a last-resort guard against bad data.

These are the kinds of criticisms I've seen recently of STI. But to be fair, you might also say that this is bad STI: it should really only be used when the subclasses have exactly the same attributes but different behavior.

The often-recommended solution is to abandon STI. But...

INNER JOIN a_buncha_different_tables ON...

If we discard STI, the only solution left is to put the different contract types in separate tables.

But that was a problem for me, because a major requirement in this system is the ability to join other tables to the contracts table and run large reports. If there were no single contracts table, that would become awkward and slow.

Also, the various contract tables would all have common attributes: start_date, end_date, etc. Duplicating these would create the same problems generally associated with duplication; for instance, if we decided that all contracts need a minimum_fee, we'd have to modify all the contract tables.

Most importantly, I had existing code that used the various contract subclasses. I wanted a solution that would be transparent to all users of the contract classes.

Do it With Delegation

After some thought, I decided to have a single contracts table with single table inheritance, but only store in it those attributes which are common to all contract classes. The other attributes are handled by composition: each contract subclass declares something like has_one :details, class_name: 'WrestlingContractDetails', and that association handles its unique attributes.

To make it transparent from the outside, each subclass:

  1. declares what attributes it delegates to its details object
  2. instantiates the details object, if necessary, whenever one of the delegated methods is called
  3. validates the attributes that it delegates, so that validation errors are on the contract itself and not the hidden details object
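Stripped of Rails, those three responsibilities might look something like the sketch below. Everything in it is an assumption made for illustration: WrestlingDetails is a plain Struct standing in for a row in a separate details table, and the errors method is a hand-rolled substitute for real validations.

```ruby
require 'forwardable'

# Hypothetical plain-Ruby stand-in for the row in a separate details table.
WrestlingDetails = Struct.new(:win_ratio, :tanning_time)

class WrestlingContract
  extend Forwardable

  attr_accessor :start_date, :end_date

  # 1. Declare which attributes live on the details object.
  def_delegators :details, :win_ratio, :win_ratio=, :tanning_time, :tanning_time=

  # 2. Build the details object the first time a delegated method needs it.
  def details
    @details ||= WrestlingDetails.new
  end

  # 3. Validate delegated attributes here, so errors are reported by the
  #    contract rather than by the hidden details object.
  def errors
    problems = []
    problems << "win_ratio can't be blank" if win_ratio.nil?
    problems
  end
end

contract = WrestlingContract.new
contract.errors          # => ["win_ratio can't be blank"]
contract.win_ratio = 0.7 # stored on the details object, not the contract
contract.errors          # => []
```

Callers only ever talk to the contract; the details object never has to appear in their code.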

So that's the basic idea: single-table inheritance, with each subclass stored in the parent table, but each having a different details association to which it delegates its unique attributes.

Now the implementation.

The Code

The core of the implementation is this method on the contract base class:

# Use in subclasses like:
# delegate_details :win_ratio, :tanning_time, to: :wrestling_details
def self.delegate_details(*attributes)
  options = attributes.extract_options!
  association_name = options.fetch(:to) {
    raise "You must specify the name of the details association"
  }

  # Override the association reader so it builds the details object on demand
  define_method association_name do
    super() || send("build_#{association_name}")
  end

  attributes.each do |attribute_name|
    # Getter, setter, and boolean getter (in case it's a boolean attribute)
    # (`extend Forwardable` from stdlib to get `def_delegators` method)
    def_delegators association_name,
      :"#{attribute_name}", :"#{attribute_name}=", :"#{attribute_name}?"
  end
end
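The super() call in the overridden reader only works if the original association reader lives in an ancestor (such as an included module) and not directly on the class. Here's a minimal plain-Ruby sketch of that mechanism. GeneratedAssociationMethods is a hypothetical stand-in for wherever ActiveRecord defines the reader, and the hash stands in for a details record.

```ruby
# Hypothetical stand-in for the module in which ActiveRecord is assumed to
# define association readers; here the association starts out unloaded (nil).
module GeneratedAssociationMethods
  def wrestling_details
    @wrestling_details
  end

  def build_wrestling_details
    @wrestling_details = { win_ratio: nil }
  end
end

class WrestlingContract
  include GeneratedAssociationMethods

  # Same shape as the reader in delegate_details: because the original reader
  # lives in an included module, `super()` reaches it from the override.
  association_name = :wrestling_details
  define_method association_name do
    super() || send("build_#{association_name}")
  end
end

contract = WrestlingContract.new
details = contract.wrestling_details       # built on first access
contract.wrestling_details.equal?(details) # true: later calls reuse it
```

The explicit parentheses matter: inside define_method, a bare super with implicit arguments is not supported, so super() is needed.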

This isn't the most obvious way to do things, but I tried several other things first.

At first I thought I could just use Rails' delegate method (it comes from ActiveSupport rather than ActiveRecord), but that will fail if the association hasn't been loaded or built yet, because the delegation target is nil.
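The failure is easy to reproduce in plain Ruby. This sketch uses Forwardable from the standard library as a stand-in for Rails' delegate, and a nil-returning attr_reader as a stand-in for an association that hasn't been built; both substitutions are assumptions made for illustration.

```ruby
require 'forwardable'

class WrestlingContract
  extend Forwardable

  # Stand-in for an association that has not been built yet: returns nil.
  attr_reader :details

  def_delegators :details, :win_ratio
end

begin
  WrestlingContract.new.win_ratio
rescue NoMethodError => error
  error.message # complains that win_ratio was called on nil
end
```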

Then I thought I could add something like after_initialize { build_details }, but that doesn't help if you do WrestlingContract.create({start_date:, win_ratio: 0.7}), because the attributes are assigned before the after_initialize callback runs.

With this implementation, I was left with only one known problem that I expect would affect others: WrestlingContract.where('win_ratio > 0.5') would fail because the contracts table doesn't have that column. I solved this by giving the class default_scope joins(:details), so that the SELECT statement would have that column available.

Alternative cat-skinning methodologies

This approach works well for us, using ActiveRecord with MySQL. For those of you using PostgreSQL, you might be able to use a similar technique without the separate detail tables by storing the extra attributes in an hstore column.

Update: PostgreSQL actually supports table inheritance, which looks like a perfect solution. (I haven't tried it; since writing this article I've changed jobs and don't currently have code needing STI.)


So what did we learn today? Let's summarize.

  • Single table inheritance can help us avoid duplicating columns for similar models
  • It can also tempt us to add nullable columns and share methods sloppily across models
  • Pulling non-shared attributes into separate tables cleans up both the schema and the model interfaces
  • Delegation can make this split invisible to the models' users, giving us the best of both worlds