January 27th, 2010 · 1 Comment
I gave a talk at Austin On Rails last night on using EventMachine, focused on maximizing concurrency when processing a message queue. There were a lot of questions, mostly about the flow of execution within EventMachine code. Two stumbling points came up repeatedly:
- Ruby developers are not used to treating blocks as true callbacks that execute at some point in the future. Blocks are usually yielded to immediately by the method they are passed to, so working out when a callback block will actually be called is confusing.
- Understanding how Fibers work and how they can make an asynchronous API appear to be synchronous to the outside world is tricky.
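Both points can be seen in a few lines of plain Ruby. This is only a minimal sketch: `ToyReactor`, `async_fetch` and `fetch` are hypothetical names made up for illustration and are not part of EventMachine. The toy reactor just queues blocks and runs them later, which stands in for an IO completion.

```ruby
# Fiber.current lives in the 'fiber' library on older Rubies.
require 'fiber' unless Fiber.respond_to?(:current)

# Toy stand-in for a reactor: blocks queued here run later, not when passed in.
class ToyReactor
  def initialize
    @pending = []
  end

  def later(&callback)
    @pending << callback
  end

  def run
    @pending.shift.call until @pending.empty?
  end
end

REACTOR = ToyReactor.new
LOG = []

# Asynchronous style: returns immediately, the block is called on a later tick.
def async_fetch(key, &callback)
  REACTOR.later { callback.call("value-for-#{key}") }
end

# Synchronous-looking wrapper: pause the current fiber until the callback fires.
def fetch(key)
  fiber = Fiber.current
  async_fetch(key) { |value| fiber.resume(value) }
  Fiber.yield
end

Fiber.new do
  LOG << "before fetch"
  value = fetch("a")          # reads like a blocking call
  LOG << "got #{value}"
end.resume

LOG << "fiber paused, main continues"
REACTOR.run                   # the callback fires here and resumes the fiber
```

The caller of `fetch` never sees a callback, yet control returns to the main program while the "IO" is outstanding.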
I hope everyone came away a little more knowledgeable about EventMachine and the types of problems it can solve. Here are the slides for others to peruse. The presentation was recorded and I will link to the recordings when I find out about them.
Scalable Ruby Processing with EventMachine (Keynote 2009, 1.2 MB)
Scalable Ruby Processing with EventMachine (Scribd)
Scalable Ruby Processing with EventMachine (Audio MP3, 49MB)
Scalable Ruby Processing with EventMachine (Video MPEG-4, 460MB)
Tags: Ruby
We run three small EC2 instances for content caching purposes at OneSpot. These systems are 32-bit machines with 1.7GB of RAM. Originally we figured even on a small system Varnish could flood a 100Mb line so we wouldn’t need a more expensive, large EC2 instance. This blog post explains why this turned out to be a poor choice.
Executive summary: Varnish really, really wants to run on a 64-bit system. Don’t run it on 32-bit systems if possible.
Varnish wants to memory map the entire cache. This means the entire cache needs to be able to fit into virtual memory. On a 64-bit system, VM is virtually unlimited. On a 32-bit system, processes usually have access to a maximum of 3GB of virtual memory. Since you also need to allocate stack space and other standard process requirements, in practice people don’t recommend more than 2GB of cache space for Varnish on 32-bit systems. Pretty small for a web content cache. If you want Varnish to use an entire disk for a cache, it must run on a 64-bit system.
We had a few minutes of outage recently due to this architecture. We read some Varnish tuning tips and decided to modify our default configuration. Specifically, we raised the minimum thread count from 1 to 500. Because, after all, “idle threads are cheap”. But they are only cheap on 64-bit systems, where allocating hundreds of MB for extra stack space is a no-brainer! When we rolled out this change, the process ran out of memory and couldn’t allocate the extra threads. Klaxons went off and I rolled back the changes. Over the next few months, we’ll be upgrading our caches to 64-bit so that we don’t need to worry about sizing issues moving forward.
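The arithmetic behind that failure can be sketched roughly as follows. The 3GB limit and the 2GB cache size come from the discussion above; the per-thread stack sizes are hypothetical inputs, not measured Varnish values, and the model ignores the heap, shared libraries and everything else that also needs address space.

```ruby
ADDRESS_SPACE_MB = 3 * 1024   # roughly 3GB of virtual memory per 32-bit process

# Address space needed for the memory-mapped cache plus all thread stacks.
def address_space_needed_mb(cache_mb, threads, stack_mb_per_thread)
  cache_mb + threads * stack_mb_per_thread
end

def fits_in_32_bit?(cache_mb, threads, stack_mb_per_thread)
  address_space_needed_mb(cache_mb, threads, stack_mb_per_thread) <= ADDRESS_SPACE_MB
end

cache_mb = 2 * 1024           # the commonly recommended 32-bit cache ceiling

# Hypothetical per-thread stack sizes (assumptions for illustration only).
[1, 8].each do |stack_mb|
  [1, 500].each do |threads|
    needed = address_space_needed_mb(cache_mb, threads, stack_mb)
    verdict = fits_in_32_bit?(cache_mb, threads, stack_mb) ? "fits" : "does not fit"
    puts "#{threads} threads x #{stack_mb}MB stack: #{needed}MB needed, #{verdict}"
  end
end
```

Even in the cases where the total stays under the limit, the remaining headroom is small, which is why a change that looks harmless on 64-bit can tip a 32-bit process over.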
Tags: Software
I’ve been enjoying my holiday break (perhaps a bit too much since I’ve produced no new blog content) but to shake off the cobwebs I’ve signed up to speak at Austin on Rails this month on “Scalable Ruby Processing with EventMachine”. I’ll discuss the advantages of event-driven programming in general, why it’s especially useful to the Ruby world and some of the work I’ve been doing in my spare time on my Evented project. Hope to see you there!
Tags: Ruby
December 1st, 2009 · 1 Comment
Getting concurrency in Ruby is tough: Ruby 1.8 threads are green so they don’t execute concurrently. Ruby 1.9 threads are native but they don’t execute concurrently due to the GIL (global interpreter lock) necessary to ensure thread-safety with native extensions. Only JRuby provides a stable, concurrent Ruby VM today. On top of that, writing thread-safe code is tough: code execution is non-deterministic, so everyone gets it wrong, the code is hard to test and bugs are painful to track down.
For these reasons, I would argue that IO-intensive applications need to either use an event-driven application model or a language designed for concurrency like Clojure. Since I like to work with Ruby, the former is the route to follow.
This overview is important to understand because the main deployment pattern with Rails apps is to instantiate 5-10 Rails processes, which can each handle one request at a time. If a request takes 5-10 seconds to process (maybe it is calling Amazon S3 or SimpleDB), that entire Rails process is stuck waiting for the data. Even a multi-threaded Rails application is limited due to the GIL. For this reason, people use a message queue to handle long-running tasks but often that just passes the buck: now the message queue processor is the one stuck for 5-10 seconds instead. You don’t have a user waiting for a response but you still are limited in how fast you can process the queue based on the amount of memory you have and the number of daemon processes you can start.
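To put rough numbers on that limit, here is a back-of-the-envelope sketch. The process counts and job durations are the ranges mentioned above; `blocking_throughput` is a hypothetical helper, and it assumes every worker spends its whole time blocked on one job at a time.

```ruby
# Upper bound on jobs per second when each process handles one job at a time.
def blocking_throughput(processes, seconds_per_job)
  processes / seconds_per_job.to_f
end

best  = blocking_throughput(10, 5)    # 10 processes, 5 second jobs
worst = blocking_throughput(5, 10)    # 5 processes, 10 second jobs

puts "best case:  #{best} jobs/sec"
puts "worst case: #{worst} jobs/sec"
```

Adding throughput in this model means adding processes, and each extra process costs memory.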


This is where an event-driven model would help immensely. The fundamental tools at your disposal are NeverBlock and EventMachine. EventMachine provides the reactor, the fundamental “switch” in your application which decides what code is ready to run now, and NeverBlock provides various drop-in replacements for the common Ruby code used for network and IO: MySQL and PostgreSQL database drivers, TCP sockets, etc. Using these, the message queue processor can process many messages at the same time: there’s never any concurrent execution, but as one message performs some IO request, EventMachine and NeverBlock will seamlessly switch to handle another message while waiting for the IO response. That’s the fundamental difference from threaded code: instead of switching threads at a non-deterministic point in the future, event-driven code only switches when the code tries to perform IO. Your code does not need to be thread-safe because it will not be interrupted while modifying variables and data structures in memory.
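The idea of a reactor as a “switch” can be illustrated with nothing but the standard library. This is a simplified sketch, not how EventMachine is implemented: `MiniReactor` and `watch` are hypothetical names, and pipes stand in for network sockets. The loop asks the operating system which IO objects are readable and runs only the handlers whose data has arrived.

```ruby
# A tiny single-threaded reactor built on IO.select.
class MiniReactor
  def initialize
    @handlers = {}
  end

  # Register a one-shot callback to run when +io+ becomes readable.
  def watch(io, &callback)
    @handlers[io] = callback
  end

  def run
    until @handlers.empty?
      ready, = IO.select(@handlers.keys)   # wait until something is readable
      ready.each do |io|
        callback = @handlers.delete(io)
        callback.call(io.read_nonblock(1024))
        io.close
      end
    end
  end
end

reactor = MiniReactor.new
handled = []

# Three in-flight "messages", each waiting on its own IO response.
writers = %w[a b c].map do |name|
  reader, writer = IO.pipe
  reactor.watch(reader) { |data| handled << "#{name}: #{data}" }
  writer
end

# Responses arrive in a different order than the requests were issued.
writers.reverse.each do |writer|
  writer.write("reply")
  writer.close
end

reactor.run
puts handled.sort
```

Only one handler runs at a time, so the handlers can append to `handled` without any locking.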
Sounds good, right? Well, a few caveats:
- CPU-intensive processes won’t gain much. There’s still only a single actual thread of execution under the covers so event-driven applications will only take advantage of a single processor/core.
- Your application should run on Ruby 1.9 to take advantage of Fibers. Fibers have been backported to Ruby 1.8 but I encourage you to try Ruby 1.9. Most extensions are Ruby 1.9 safe now and Rails is fully supported on Ruby 1.9. Without Fibers, your application code needs to change dramatically to work as success/error callbacks. With Fibers, your code needs little change and can be written in the more familiar procedural style.
- Application exception handling becomes tricky, just as with threads. It’s easy to lose an exception.
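The exception caveat is easy to reproduce with a toy reactor. In this sketch (hypothetical `ToyReactor`, not EventMachine's API), the `rescue` around the call site never fires because the block runs later, from inside the reactor loop. Rescuing in the loop itself is one possible way to avoid losing the error.

```ruby
class ToyReactor
  attr_reader :errors

  def initialize
    @pending = []
    @errors = []
  end

  def later(&callback)
    @pending << callback
  end

  # Run each callback, recording failures instead of letting them vanish
  # or take down the whole loop.
  def run
    until @pending.empty?
      begin
        @pending.shift.call
      rescue StandardError => e
        @errors << e
      end
    end
  end
end

reactor = ToyReactor.new
rescued_at_call_site = false

begin
  # Scheduling the block does not run it, so nothing is raised here.
  reactor.later { raise ArgumentError, "bad message" }
rescue ArgumentError
  rescued_at_call_site = true   # never reached
end

reactor.run
puts "rescued at call site? #{rescued_at_call_site}"
puts "seen by reactor: #{reactor.errors.map(&:message).inspect}"
```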
Next time, we’ll take a deeper look into some event-driven code and how it works.
Tags: Ruby · Software
November 2nd, 2009 · 1 Comment
I needed to create a simple, but IO-intensive, thumbnailing service for OneSpot last week. This service acts as a proxy to S3 and so a blocking implementation would not scale well, even if threaded. I wanted to use EventMachine instead. Lessons learned:
- The programming model is a mind twist and takes a long time to understand. The sprinkling of implementation at various layers makes it harder. I’ve spent several days now reading through EventMachine, Thin, Rack and em-http-request source code.
- There are no non-trivial examples out there. It seems like every example is 10 lines of “Hello World” code with no samples of how to integrate multiple pieces. Ok, here’s a 10 line async web server. Now how do I integrate an async call to the DB? How do I make an async 3rd party web service call?
- There’s no testing support. No libraries for doing async testing and no best practices or suggestions on how to test.
So how can we make things better rather than just complain? I’m going to show you a non-trivial example. In return, I want you to send me more examples. Evented is my new Github repository for EventMachine examples. The first example is that big chunk of code that I puzzled over for the last few days which implements the thumbnailing service using Thin, S3, image_science and em-http-request. But I want more examples and I’d love to hear ideas on how to test this type of code. Leave a comment, send a pull request, and help me help you!
Tags: Ruby
October 16th, 2009 · 6 Comments
After talking about document-oriented databases in general in Part 1, for Part 2 I’ve written some code comparing MongoDB 1.1.1, CouchDBX 0.9.1 and Tokyo Tyrant 1.4.32 in an apples-to-apples test.

The shootout code is on Github. I welcome patches and improvements as long as they don’t bias the tests in favor of any one system.
Results
========== Running Tokyo Tyrant tests
Using rufus-tokyo 1.0.0
             user     system      total        real
init     0.000000   0.000000   0.000000 (  0.013781)
create  19.770000   4.260000  24.030000 ( 39.982273)
query    0.160000   0.030000   0.190000 (  0.318070)
delete   0.000000   0.000000   0.000000 (  0.421201)
========== Running MongoDB tests
Using mongo + mongo_ext 0.15.1
             user     system      total        real
init     0.000000   0.000000   0.000000 (  0.005074)
create  54.710000   1.750000  56.460000 ( 57.358498)
query    0.120000   0.010000   0.130000 (  0.155486)
delete   0.000000   0.000000   0.000000 (  0.957453)
========== Running CouchDB tests
Using jchris-couchrest 0.23
             user     system      total        real
init     0.000000   0.000000   0.000000 (  0.000007)
create   9.290000   0.560000   9.850000 ( 51.177824)
init is the time required to initialize the database and create any necessary indices. In practice, this number isn’t terribly relevant as this is usually an infrequent operation.
The create operation shows how long it takes for the system to bulk load 200,000 documents. You can see that Tokyo is quite fast while the Mongo client hits the CPU pretty hard. The couchrest client seems more efficient than the other two but the task still takes longer than Tokyo.
The query operation shows how long it takes to perform a non-trivial query against those 200k documents. Both Mongo and Tokyo perform about the same speed although Mongo lazy fetches the results in order to minimize network traffic when used with pagination. Tokyo returns the entire result at once AFAIK. I was not able to complete this test in a weekend using CouchDB because its view layer is so alien to me. I’d welcome help with this task.
The delete operation tests the time required to delete a subset of documents within our set of 200,000. Again, Tokyo comes out on top. Since I couldn’t perform the query in CouchDB I couldn’t delete anything either.
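For readers who want to see the shape of such a test, here is a small sketch using Ruby's standard `Benchmark` module, which produces the user/system/total/real columns shown in the results. This is not the shootout code: `MemoryStore` is a hypothetical in-memory stand-in for a database client, and the document count is far smaller than the 200,000 used in the real runs.

```ruby
require 'benchmark'

# Hypothetical stand-in for a database client: a Hash keyed by document id.
class MemoryStore
  def initialize
    @docs = {}
  end

  def create(id, doc)
    @docs[id] = doc
  end

  # Return the documents for which the block is true.
  def query
    @docs.select { |_id, doc| yield doc }
  end

  def delete(ids)
    ids.each { |id| @docs.delete(id) }
  end

  def size
    @docs.size
  end
end

DOC_COUNT = 2_000
store = MemoryStore.new
matches = nil

Benchmark.bm(7) do |bm|
  bm.report("create") { DOC_COUNT.times { |i| store.create(i, "score" => i % 100) } }
  bm.report("query")  { matches = store.query { |doc| doc["score"] > 90 } }
  bm.report("delete") { store.delete(matches.keys) }
end
```

Swapping `MemoryStore` for a thin wrapper around each real driver keeps the timed operations identical across systems.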
Conclusions? Tokyo has a reputation for being very fast and it appears to be well-founded. Couch is fast for what I could get working – I would be much more concerned about developer training and learning curve with Couch. Mongo is by no means slow but someone has to finish last. I like Mongo as an interesting mix of RDBMS and document technologies – it’s not quite as conventional as Tokyo but not as unconventional as CouchDB with its unique view layer and Erlang underpinnings. What do you think? Leave a comment and let me know!
Tags: Software
We’re looking for a Ph.D-level machine learning specialist who will maintain and improve our content scoring algorithms and codebase at OneSpot. Our current system is based on technologies like Hadoop, Cascading and EC2. The position is full-time in Austin, TX. Please contact me if you or someone you know is looking for this type of job.
Tags: Software
With the closing of FiveRuns, the blog post that was the main page for DataFabric has disappeared. I’m creating this blog post as a replacement.
DataFabric is a Rails plugin/gem that adds sharding support to ActiveRecord. It supports Rails 2.x and is theoretically database independent, although I haven’t tested it personally with anything but MySQL and SQLite. Please see the github home page for the latest code, examples and a more in-depth explanation.
Tags: Rails
September 30th, 2009 · 1 Comment
RubyConf 2009 is taking place in San Francisco November 19-21. I’ll be there and have most of the 18th free if anyone is near SFO and wants to join me in some coffeeshop coding.
Tags: Personal · Ruby
September 1st, 2009 · 6 Comments
MongoDB is a relatively new “schema-free, document-oriented database.” The closest competitor to MongoDB is probably CouchDB or Tokyo Cabinet’s Table database but all three differ in significant ways:
- CouchDB guarantees the ACID properties when saving documents through an MVCC mechanism like PostgreSQL. Tokyo Cabinet provides ACID support via locking, like MySQL. Mongo updates documents in place with no real support for concurrency (e.g. optimistic or pessimistic locking). This means Mongo will be much faster for writing and scale horizontally very easily at the expense of guaranteed data consistency. This is a very common tradeoff.
- Couch and Mongo support datatypes for the values in documents but Mongo uses a binary JSON representation and protocol which makes it faster over the wire. Tokyo Cabinet does not support datatypes, except for string and number types in indexes. It does not have native boolean and date types which means you can’t efficiently do queries like “created_at < 1 week ago” although you could store dates and booleans as numbers to work around this limitation.
- Couch and Mongo support more complex structures (arrays, hashes) as values. Tokyo only supports basic datatypes.
- Couch requires you to instantiate views for the queries required by your application. Couch will then auto-index the data required to fetch the view. Tokyo Cabinet and Mongo have a more traditional RDBMS notion of indexes, which are maintained separately from the table.
- CouchDB is written in Erlang while MongoDB is written in C++ and Tokyo Cabinet in C. I’m inclined to trust Erlang more for distributed infrastructure, given its long history in telecom. That said, I have no evidence that the other two are anything but rock solid.
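The workaround of storing dates as numbers can be sketched in plain Ruby. No database driver is involved here: the records are ordinary hashes, and `encode_time` and `created_before` are hypothetical helpers. Because the timestamp is stored as an integer number of seconds, “created_at < 1 week ago” becomes a simple numeric comparison that a number index could serve.

```ruby
WEEK_SECONDS = 7 * 24 * 60 * 60

# Store times as integer seconds since the epoch.
def encode_time(time)
  time.to_i
end

# Numeric range query: records created before the cutoff time.
def created_before(records, cutoff)
  limit = encode_time(cutoff)
  records.select { |record| record["created_at"] < limit }
end

now = Time.utc(2009, 9, 1)
records = [
  { "name" => "old", "created_at" => encode_time(Time.utc(2009, 8, 1)) },
  { "name" => "new", "created_at" => encode_time(Time.utc(2009, 8, 30)) }
]

stale = created_before(records, now - WEEK_SECONDS)
puts stale.map { |record| record["name"] }.inspect
```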
All projects are interesting takes on the traditional RDBMS datastore. CouchDB would be useful where you absolutely must keep ACID and transactions to ensure data integrity but want to avoid the hard-coded schema that a traditional database requires. Semantic web applications come to mind where your objects are just a bag of attributes.
MongoDB would seem to be more designed for applications which need dynamic query functionality with high performance and can sacrifice data integrity to get it – metrics and operational data come to mind. As the MongoDB website says: “High volume, low value data”.
Tokyo Cabinet feels a little more traditional and lower-level, like a layer on top of BerkeleyDB. It’s similar in design in that they are both C libraries and not designed to run as standalone daemons themselves. It would be great for embedded applications.
In my next post, I’ll try out each with their latest Ruby driver and see how they perform in basic use cases. Did I get anything wrong? Leave a comment and let me know!
Update: Tokyo Tyrant does not appear to support transactions and so Tokyo Cabinet cannot guarantee ACID when used as a service.
Tags: Software