Using ActiveRecord with EventMachine

Given all my work with Fibers and EventMachine over the last three months, it should come as no surprise that I’ve been building infrastructure on top of them to get maximum scalability without the callback style of code, which I dislike for many reasons. Watch my talk on scaling with EventMachine if you need more background on the problem.

Now that I have RabbitMQ, Cassandra, Solr and the Amazon AWS services evented, the only holdup was ActiveRecord. Some people may advocate using another ORM layer but when you have 2-3 other Rails apps, all sharing 100+ models, you can’t afford to maintain two separate ORM layers. Plus, frankly I like the Rails stack: it works pretty well, is thoroughly documented and every Ruby developer is familiar with it.

So what do we need to do to get AR working event-style? At a high level, two things are required:

  • The database driver itself must be modified to send SQL asynchronously. The postgresql driver, for instance, calls the exec(sql) method for all traffic to the database. So we just need to provide an exec method which uses Fibers under the covers to work asynchronously.
  • AR’s connection pooling needs to be Fiber-safe. Out of the box, it is Thread-safe. Since we are using an execution model based on a single Thread with multiple Fibers, all the Fibers would try to use the same connection, with disastrous consequences.
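To make the first point concrete, here is a minimal sketch of a Fibered exec. It is not the em_postgresql code: FakeAsyncDriver, send_query and run_pending are hypothetical stand-ins for an evented driver and one tick of the reactor, so the example runs without EventMachine or a database. The part that matters is the shape of exec: capture the current Fiber, hand the driver a callback that resumes it, then yield.

```ruby
require 'fiber' # provides Fiber.current on Ruby 1.9; harmless on newer Rubies

# Hypothetical stand-in for an evented driver: send_query returns at once
# and the callback fires later, when the "reactor" delivers the result.
class FakeAsyncDriver
  def initialize
    @pending = []
  end

  def send_query(sql, &callback)
    @pending << [sql, callback]
  end

  # Stand-in for one tick of the event loop: deliver every pending result.
  def run_pending
    batch, @pending = @pending, []
    batch.each { |sql, callback| callback.call("result of #{sql}") }
  end
end

class FiberedConnection
  def initialize(driver)
    @driver = driver
  end

  # Reads like a blocking call, but only parks the calling Fiber.
  # Must be called from inside a Fiber, not from the root Fiber.
  def exec(sql)
    fiber = Fiber.current
    @driver.send_query(sql) { |result| fiber.resume(result) }
    Fiber.yield # control returns to the reactor until the callback fires
  end
end

driver = FakeAsyncDriver.new
conn   = FiberedConnection.new(driver)
log    = []

fibers = %w[a b].map do |name|
  Fiber.new do
    log << "#{name}: sent"
    log << "#{name}: #{conn.exec("SELECT '#{name}'")}"
  end
end

fibers.each(&:resume) # both queries are now in flight
driver.run_pending    # results arrive; each Fiber picks up where it left off
```

Both Fibers send their query before either sees a result, yet the code inside each Fiber is written as straight-line, synchronous-looking Ruby.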

These are exactly the things that em_postgresql does. It has three parts:

  • postgres_connection is a basic, EM-aware Postgres driver. It provides the Fibered exec() method which makes the whole thing asynchronous.
  • em_postgresql_adapter.rb wraps postgres_connection to make it a proper ActiveRecord driver.
  • patches.rb overrides a bunch of AR’s internal connection pooling to make it Fiber-friendly.

Unfortunately, the connection pooling patch makes one hack necessary: we need a list of the current Fibers so that we can release any lingering connections associated with Fibers that are no longer running. The Threaded version can use Thread.list, but Ruby does not provide an equivalent method for Fibers. Instead, I require the application to register a FiberPool with AR so that stale connections can be cleared.
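The idea can be sketched in a few lines. This is a toy pool, not AR’s ConnectionPool and not the contents of patches.rb; FiberKeyedPool, register_fibers and clear_stale! are hypothetical names, and a plain Array of Fibers stands in for the registered FiberPool. Connections are keyed by Fiber.current rather than by thread, and anything held by a Fiber that is finished or unregistered goes back into the pool.

```ruby
require 'fiber' # provides Fiber.current and Fiber#alive? on Ruby 1.9

# Hypothetical miniature pool: connections are keyed by Fiber, not Thread.
class FiberKeyedPool
  def initialize(size)
    @available = Array.new(size) { |i| "conn-#{i}" } # fake connections
    @reserved  = {}                                  # Fiber => connection
    @fibers    = []                                  # registered by the app
  end

  # The application hands over its Fibers, since Ruby has no Fiber.list.
  def register_fibers(fibers)
    @fibers = fibers
  end

  # Each Fiber gets its own connection, even on a single thread.
  def connection
    @reserved[Fiber.current] ||= checkout
  end

  # Return connections held by Fibers that are finished or unregistered.
  def clear_stale!
    @reserved.keys.each do |fiber|
      next if @fibers.include?(fiber) && fiber.alive?
      @available.push(@reserved.delete(fiber))
    end
  end

  def available_count
    @available.size
  end

  private

  def checkout
    @available.shift or raise "no free connections"
  end
end

pool = FiberKeyedPool.new(2)
seen = []

# Two Fibers that grab a connection and finish without releasing it.
fibers = Array.new(2) { Fiber.new { seen << pool.connection } }
pool.register_fibers(fibers)
fibers.each(&:resume)

before = pool.available_count # both connections are still checked out
pool.clear_stale!
after = pool.available_count  # both are back in the pool
```

Without the registered list there would be nothing to walk: the pool could only reclaim a connection when the owning Fiber asked it to.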

So what does it all mean? Well, here’s a Sinatra application that uses plain old ActiveRecord and is completely asynchronous! Try ab -n 100 -c 20 http://localhost:9292/test to hit the app with 20 concurrent connections; it will process them all concurrently, without any of the painful issues that come with threading (autoloading, misbehaving extensions, etc). Awesome!

You can probably guess what’s next. Coming soon: the whole Rails stack, running asynchronously…

3 thoughts on “Using ActiveRecord with EventMachine”

  1. Great work! Will this also work for Rails 3? Hopefully, the decoupling introduced there would negate the need for monkey patching.

  2. This is one of the coolest things being done to Rails since the decisions that led to Rails 3.

    I’m glad event-based programming is getting more popular in web programming, with node.js being the most prominent example.

    Any chance of porting your solution to non-fiber ruby 1.8, or do you want it to remain 1.9-only?

  3. Thanks guys. I’m not targeting Rails 3 as it is still a moving target, i.e. under construction. Now that I’ve got Rails 2.3 working though, I will probably spend some time looking into Rails 3.

    There’s no way to port my solution to non-Fibers. You’d need to look at something like Cramp if you want an asynchronous web framework without Fibers.

    Look for Joe Damato and Aman Gupta’s post on backporting Fibers to Ruby 1.8. It can be done.
