Ruby Performance 2014

Last year I posted a comparison of various Ruby VMs and how fast they could process N empty jobs. This is the equivalent of pumping out “Hello World” responses in an app server: it’s not very useful for application developers but it’s far more useful than a microbenchmark in determining real Ruby VM performance. Let’s take a look at the three most popular versions available today: MRI 2.1.1, MRI 2.0.0 and JRuby 1.7.11.

Time required to process 50,000 empty jobs with a single Sidekiq process running 25 threads.

Version        Time     With Logging
MRI 2.1.1      46 sec   67 sec
MRI 2.0.0      50 sec   70 sec
JRuby 1.7.11   33 sec   51 sec

 

Like last year, JRuby continues to dominate in raw runtime performance. MRI 2.1.1 shows a small performance advantage over MRI 2.0.0.

“With Logging” shows some interesting data: just logging the start and finish times of the jobs to the global logger proves to be a major performance hit. The reason is that Ruby’s Logger contains an internal Mutex to ensure that data is logged to the stream atomically. This Mutex becomes a source of contention when 25 threads are processing those no-op jobs. Your first instinct might be to optimize the Logger but this is a red herring! During normal execution the logger won’t be as heavily contended because your jobs are actually doing work.
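To see the effect for yourself, here is a minimal sketch of the same idea without Sidekiq: a pool of threads drains a queue of no-op jobs, optionally logging the start and finish of each one through a shared Logger. The helper name run_noop_jobs and its parameters are my own invention for illustration; this is not the benchmark code behind the numbers above.

```ruby
require "logger"
require "thread"

# Hypothetical helper (not part of Sidekiq): runs `job_count` no-op jobs
# across `thread_count` threads and returns how many jobs were processed.
# When a logger is given, every job logs a start and a finish line, so all
# threads compete for the logger's internal lock.
def run_noop_jobs(job_count:, thread_count:, logger: nil)
  queue = Queue.new
  job_count.times { |i| queue << i }

  workers = Array.new(thread_count) do
    Thread.new do
      done = 0
      loop do
        begin
          job = queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        logger.info("start job #{job}") if logger
        # the job itself does nothing
        logger.info("done job #{job}") if logger
        done += 1
      end
      done
    end
  end

  # Thread#value joins the thread and returns the block's result.
  workers.map(&:value).inject(0, :+)
end
```

Wrapping two calls in Benchmark.realtime, one with logger: nil and one with Logger.new($stdout), should show the gap on your own machine. The absolute numbers will differ from the table above.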

Details:

The actual code is here.

Run on a late-2013 Retina MacBook Pro with a 2-core 2.8GHz Core i7, running on battery. Defaults were used for everything.

java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

Sidekiq 3.0!

After tons of work on Sidekiq 2.x for the last 18 months, I decided it was time for some deeper refactoring and cleanup necessitating a major version bump.

Sidekiq 3.0 is the result of three months of hacking, cleanup and community suggestions. There’s a huge amount of stuff in here so hang on to your hats…

Dipping a Toe into Open Source

This is an excerpt from the foreword I wrote for Brandon Hilkert’s new e-book, Build a Ruby Gem.

I was a junior in college when I published my first open source program. It was Fall 1995 and Windows NT 3.5 had this fat and slow interface for launching applications. Windows 95 had an awesome new Start bar and so the answer was obvious: I decided I wanted to learn how to program the new Win32 API and solve my own problem at the same time. I set out to write a lighter-weight, fast application launcher in the vein of the Start bar called AppBar.


Happy 2nd Birthday Sidekiq!

Let’s review some numbers. From the 1st birthday:

  • 214,300 downloads
  • 2,144 stars
  • 662 closed issues
  • 266 forks
  • 228 closed pull requests
  • 44 versions released
  • 25 Sidekiq Pro customers

Now, on the 2nd birthday:

  • 1,192,259 downloads (wow, huge uptake!)
  • 3,535 stars
  • 1,420 closed issues
  • 563 forks
  • 380 closed pull requests
  • 74 versions released
  • Over 200 Sidekiq Pro customers

At this point I believe I’ve achieved my goal: to build the best background job framework, bar none. With Sidekiq I try to have it all: good performance, easy setup, deep integration with an application framework like Rails and a rich set of functionality. I hope you think I was successful in my efforts.

As always, thank you to my users and keep ‘kiqing!

Don't Forget What's Important

Technologies come and go. We learn and grow as engineers over time but some things are eternal: knowing what is truly important to you is critical in telling a path to misery from a path to fulfillment. Like your coworkers and your environment, the technology you work with day to day can make a big difference in your job satisfaction.


Advanced Sidekiq: Host-specific Queues

This is the first in a series of posts offering neat tricks to get the most out of Sidekiq.

Recently we rewrote part of The Clymb to process images asynchronously using Sidekiq. The user uploads the image file, it is saved to disk and a job is created to process the file. Almost immediately we saw a bunch of retries with the error “Unable to find file xyz.jpg”. We just uploaded the file, how could it not be there?

The problem is that we have multiple app servers and they all run Unicorn and Sidekiq. This means the file can be uploaded to a Unicorn on app-1 and the job processed by a Sidekiq on app-2. The job queue is global to the cluster but the filesystem is local. The solution is a cool hack: use a queue which is processed only by Sidekiq processes on that server.

First we need to tell each Sidekiq process to listen to a queue named after the machine’s hostname. In your config/sidekiq.yml, do this:

---
:verbose: false
:concurrency: 25
:queues:
  - default
  - <%= `hostname`.strip %>

Sidekiq runs the YAML file through ERB automatically so you can easily add the queue dynamically.
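If you want to check what the rendered config looks like, the ERB-then-YAML step can be sketched in plain Ruby. This is only an approximation of what happens when the file is loaded, not Sidekiq’s actual loader: load_sidekiq_config and CONFIG_TEMPLATE are hypothetical names, and the template takes the hostname from a local variable (defaulting to Socket.gethostname) instead of shelling out, so it is easy to try with a fake hostname.

```ruby
require "erb"
require "yaml"
require "socket"

# Hypothetical template mirroring the config above. `hostname` here is a
# local variable supplied by the helper below, not a shell command.
CONFIG_TEMPLATE = <<'CONFIG'
---
:verbose: false
:concurrency: 25
:queues:
  - default
  - <%= hostname %>
CONFIG

# Hypothetical helper: render the ERB tags first, then parse the result
# as YAML. The template can see `hostname` through the method's binding.
def load_sidekiq_config(template, hostname = Socket.gethostname)
  rendered = ERB.new(template).result(binding)
  YAML.load(rendered)
end
```

Calling load_sidekiq_config(CONFIG_TEMPLATE, "app-1")[:queues] gives you the default queue plus one named app-1, which is the list a Sidekiq process on that box would listen to.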

Second, we need to configure the jobs to use the queue:

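class ImageUploadProcessor
  include Sidekiq::Worker
  # Evaluated when the class is loaded, so jobs created on this machine
  # go to the queue named after this machine's hostname.
  sidekiq_options queue: `hostname`.strip

  def perform(filename)
    # process image
  end
end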

Now when we create an ImageUploadProcessor job, it will be saved to a queue named after the machine’s hostname and processed by a Sidekiq worker on that machine. Easy!

The Emperor has no Clothes

“In theory, theory and practice are the same. In practice, they are not.” — Albert Einstein

The original Dynamo paper created a wave of interest in the CAP theorem and gave rise to the recent crop of distributed databases: Cassandra, Riak, et al. These systems are generally AP where C can be tuned to provide some guarantee of consistency, i.e. they do their best to provide CAP according to the application’s needs. For instance, you might have a cluster of 5 nodes where a write to the cluster will return success if 3 of the nodes acknowledge the write. The cluster will still be available even if two of the machines fail.
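The arithmetic behind that example is simple enough to write down. Here is a toy model, with method names I made up for illustration; it only counts acknowledgements and ignores everything that makes real clusters hard (partitions, hinted handoff, read quorums).

```ruby
# Hypothetical toy model of a quorum write: the write succeeds when at
# least `write_quorum` of the nodes that are still up acknowledge it.
def quorum_write_ok?(live_nodes, write_quorum)
  live_nodes >= write_quorum
end

# Number of nodes that can fail while writes still succeed.
def tolerated_failures(total_nodes, write_quorum)
  total_nodes - write_quorum
end
```

With 5 nodes and a write quorum of 3, tolerated_failures(5, 3) is 2: writes keep succeeding with two machines down, and start failing when a third goes.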

In theory they are a great way to ensure availability to your application in the face of network failures. In practice, I believe these databases are so complex that they often provide less availability than a simpler CP system like a SQL database.

On Ruby, Software and the Internet