Background Processing vs Message Queueing

One common simplification I see engineers make is equating message queueing with background processing. This is what they are missing: message queueing is a superset of background processing. All message processing is done in the background but background processing does not have to be done via message queues.

Take a simple use case: “I want to send a welcome email when a user registers”. Commonly you want to send this email in the background so it does not impact the user’s experience. Do you need to install ActiveMQ, RabbitMQ or Resque to do this? Certainly not.
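To give a feel for how little machinery this use case needs, here is a minimal sketch using only Ruby's standard library. The names `register`, `send_welcome_email` and `SENT` are hypothetical stand-ins for illustration, not part of any framework, and the "email" is just recorded in memory rather than actually delivered.

```ruby
require "thread"

# Hypothetical stand-in for a real mailer: records the address instead of sending.
SENT = Queue.new

def send_welcome_email(address)
  sleep 0.05 # simulate a slow SMTP round trip
  SENT << address
end

# Hypothetical registration handler: it returns right away and lets a
# background thread take care of the email.
def register(address)
  worker = Thread.new { send_welcome_email(address) }
  [:registered, worker]
end

status, worker = register("alice@example.com")
worker.join # only so this script waits before exiting; a web request would not join
```

The caller gets its `:registered` result without waiting on the mailer, which is the whole point of moving the work to the background.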

Message queueing is a fundamental architectural pattern when building complex systems. Your various system components might be written by different teams but they communicate through messages sent via queues. One component can send a message to another component, saying “please send this email”. But message queueing systems have their cost: they are complex because they are designed to be the foundation of your distributed system. They must be deployed and monitored like the rest of your infrastructure; they must be reliable and highly available.

I think that a lot of people install a message queue to perform simple background processing; it doesn’t need to be that complicated. The fundamental question to me is, “Am I communicating between different subsystems or just trying to spin off some work?” The registration email use case comes up almost immediately when building nearly every website. Consider also the case where you want to perform some action that might take 30-60 seconds and have the user’s browser poll for the result. Spinning off a separate thread to perform this work is entirely sufficient and much simpler.

This is the reasoning behind my girl_friday project. I want a simple and reliable way to perform background processing without needing the complexity of an MQ system. Let’s examine a few characteristics of girl_friday:

  • In-process – your background processor is part of your Ruby application and has access to the exact same codebase as your webapp. No need to share ActiveRecord models across projects via git or filesystem trickery. No need to deploy or monitor a separate set of processes.
  • Threaded – huge memory savings because you don’t have to spin up other processes which load the exact same code. Threads are notoriously tricky to get correct so girl_friday uses Actors for the equivalent behavior in a simpler and safer API.
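The in-process, threaded idea can be sketched with nothing but the standard library. This is not girl_friday's implementation or API (girl_friday uses Actors, and this sketch uses plain threads around a shared `Queue`); `SimpleWorkQueue` and everything in it are hypothetical names chosen for illustration.

```ruby
require "thread"

# Hypothetical in-process work queue: a fixed pool of threads draining a shared Queue.
class SimpleWorkQueue
  def initialize(size:, &processor)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job arrives; a nil job is the signal to stop.
        while (job = @jobs.pop)
          begin
            processor.call(job)
          rescue StandardError => e
            warn "job failed: #{e.message}" # keep the worker alive after an error
          end
        end
      end
    end
  end

  def push(job)
    @jobs << job
  end
  alias_method :<<, :push

  # Let the workers finish everything already queued, then stop them.
  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

delivered = Queue.new # thread-safe collector standing in for real email delivery
email_queue = SimpleWorkQueue.new(size: 3) { |msg| delivered << "welcome:#{msg[:email]}" }
email_queue << { email: "alice@example.com" }
email_queue.push({ email: "bob@example.com" })
email_queue.shutdown
```

Because the workers live inside the application process, the block can call any code the webapp can, and there is nothing extra to deploy. The flip side is that queued jobs exist only in memory and disappear if the process dies.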

I have issues with the other contenders in the space:

  • delayed_job – stores jobs in your RDBMS and polls for jobs, which is a terribly unscalable idea. Spins off processes instead of threads.
  • resque – forks a new process for every message. Safe but memory hungry.

The biggest caveat with girl_friday is threading, of course. Typical Ruby deployments aren’t thread-friendly but I’d like to help change that. Rainbows! is thread-friendly, as are all the JRuby app servers. The girl_friday wiki gives more specifics about features and usage. Are there any other dimensions to the problem that I’m missing? Any other projects that solve a similar problem? Post a comment and let me know!

14 thoughts on “Background Processing vs Message Queueing”

  1. What would be your recommended approach to using girl_friday with AMQP in the absence of a reactor? I would like messages from my event bus to trigger girl_friday jobs, but the AMQP gem runs in EventMachine. There are others like bunny, but I think they would require polling to get messages. I was thinking I could run EventMachine in a thread and stick messages into girl_friday queues as they arrive. Is there a better solution that doesn’t require a reactor?

  2. Is your bolded sentence around the right way? Shouldn’t it be message queues are a subset of background processing?

  3. The obvious trade-off in using a simple thread is one of durability. If anything happens to the environment your thread is running on, your request is lost. This risk can be mitigated, but nearly all solutions will start to add to the overall complexity.

  4. This is a very simplistic analysis. You forget scalability, workload distribution, fault tolerance, etc., none of which are going to be satisfied by a plain threaded solution... unless you write enough code, and then it looks like a messaging product :)

  5. “No need to share ActiveRecord models across projects via git or filesystem trickery.”

    This is a pretty big no-no when designing a system of services. Each service should be treated as a fully encapsulated interface; services should never connect directly to the main application’s datastore.

    Someone brought up the idea of using Sidekiq as a queue between services recently, i.e. one app pushes to it, and another app has workers that subscribe to and consume from it. Thought that was an interesting idea. Any thoughts on that?
