One common simplification I see engineers make is equating message queueing with background processing. This is what they are missing: message queueing is a superset of background processing. A message queue can do everything a background processor does, and its messages are all processed in the background, but background processing does not have to be done via message queues.
Take a simple use case: “I want to send a welcome email when a user registers”. Commonly you want to send this email in the background so it does not impact the user’s experience. Do you need to install ActiveMQ, RabbitMQ or Resque to do this? Certainly not.
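To make that concrete, here is a minimal sketch of sending the welcome email from a plain Ruby thread, with no queueing infrastructure at all. `WelcomeMailer` and `register` are hypothetical names invented for this illustration (a real app would call its own mailer), and the `sleep` merely stands in for a slow SMTP round trip.

```ruby
# Hypothetical stand-in for a real mailer: it only records what it would send.
class WelcomeMailer
  SENT = Queue.new # Queue is thread-safe, so threads can append to it freely

  def self.deliver(email)
    sleep 0.05 # simulate a slow network call
    SENT << email
  end
end

# Hypothetical registration handler. It returns right away; the email is
# delivered on a separate thread so the caller does not wait for it.
def register(email)
  # ... create the user record here ...
  Thread.new do
    begin
      WelcomeMailer.deliver(email)
    rescue StandardError => e
      # Without this rescue the thread would die quietly and the error
      # would only surface if someone joined the thread.
      warn "welcome email to #{email} failed: #{e.message}"
    end
  end
end

mailer_thread = register("new.user@example.com")
mailer_thread.join # demo only: wait so the script does not exit early
```

The `join` at the end exists only so the demo waits for the thread; a web request would return without it. A bare thread like this gives you no retries and no limit on how many threads run at once, which is the trade-off for having nothing extra to deploy.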
Message queueing is a fundamental architectural pattern when building complex systems. Your various system components might be written by different teams but they communicate through messages sent via queues. One component can send a message to another component, saying “please send this email”. But message queueing systems have their cost: they are complex because they are designed to be the foundation of your distributed system. They must be deployed and monitored like the rest of your infrastructure; they must be reliable and highly available.
I think that a lot of people install a message queue to perform simple background processing; it doesn’t need to be that complicated. The fundamental question to me is, “Am I communicating between different subsystems or just trying to spin off some work?” The registration email use case comes up almost immediately when building nearly every website. Consider also the case where you want to perform some action that might take 30-60 seconds and have the user’s browser poll for the result. Spinning off a separate thread to perform this work is entirely sufficient and much simpler.
This is the reasoning behind my girl_friday project. I want a simple and reliable way to perform background processing without needing the complexity of an MQ system. Let’s examine a few characteristics of girl_friday:
- In-process – your background processor is part of your Ruby application and has access to the exact same codebase as your webapp. No need to share ActiveRecord models across projects via git or filesystem trickery. No need to deploy or monitor a separate set of processes.
- Threaded – huge memory savings because you don’t have to spin up other processes which load the exact same code. Threads are notoriously tricky to get correct so girl_friday uses Actors for the equivalent behavior in a simpler and safer API.
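The in-process, threaded idea can be sketched with nothing but Ruby's built-in `Queue` and a few threads. To be clear, this is not girl_friday's implementation (which uses Actors, as noted above) and not its API; `TinyWorkQueue` and every name in it are made up for illustration.

```ruby
# A tiny in-process work queue: a fixed pool of threads pulls jobs off a
# thread-safe Queue and hands each one to the block given at construction.
class TinyWorkQueue
  STOP = Object.new # sentinel pushed once per worker to end its loop

  def initialize(size: 2, &handler)
    @queue = Queue.new
    @handler = handler
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job is available, so idle workers sleep.
        until (job = @queue.pop).equal?(STOP)
          begin
            @handler.call(job)
          rescue StandardError => e
            # One failing job must not take the worker thread down with it.
            warn "job failed: #{e.message}"
          end
        end
      end
    end
  end

  # Enqueue a job and return immediately; returns self so calls can chain.
  def push(job)
    @queue << job
    self
  end
  alias << push

  # Let the workers drain everything already queued, then wait for them.
  def shutdown
    @workers.size.times { @queue << STOP }
    @workers.each(&:join)
  end
end

# Usage: the handler here just records a string instead of sending mail.
delivered = Queue.new
email_queue = TinyWorkQueue.new(size: 2) do |msg|
  delivered << "welcome sent to #{msg[:email]}"
end
email_queue << { email: "a@example.com" } << { email: "b@example.com" }
email_queue.shutdown
```

Because the workers live in the same process, the handler block can call any code the webapp can, which is the point of the first bullet. The cost is that queued jobs live only in memory in this sketch, so they are lost if the process exits before `shutdown` finishes.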
I have issues with the other contenders in the space:
- delayed_job – stores jobs in your RDBMS and polls the database for work, which scales poorly because every additional worker adds more polling queries. Spins off processes instead of threads.
- resque – forks a new process for every message. Safe but memory hungry.
The biggest caveat with girl_friday is threading, of course. Typical Ruby deployments aren’t thread-friendly but I’d like to help change that. Rainbows! is thread-friendly, as are all the JRuby app servers. The girl_friday wiki gives more specifics about features and usage. Are there any other dimensions to the problem that I’m missing? Any other projects that solve a similar problem? Post a comment and let me know!
8 responses so far ↓
1 user // May 4, 2011 at 7:36 pm
This seems to be an interesting solution for simple things.
http://railstips.org/blog/archives/2011/05/04/eventmachine-and-passenger/
2 Mike Perham // May 4, 2011 at 9:53 pm
That seems like a terrible solution to me. Introducing EventMachine into a background thread is a bizarre hack.
3 Cyril Rohr // May 17, 2011 at 4:02 pm
Regarding the last comment, it is actually something that I’ve seen in the wild before (e.g. https://github.com/futurechimp/enigmamachine/blob/master/lib/enigmamachine.rb#L99-109), and when you think about it it just reabsorbs the thread into the main one once the reactor has been correctly started (no #join on the thread).
It looks like a hack, but it works well in practice, so I would be happy to know if this is a bad idea or not!
4 Mike Perham // May 18, 2011 at 11:28 am
Mixing threads and reactors seems like a bad idea. It may work but I don’t pretend to understand whether this might be brittle or error prone.
6 Nathan // Jun 19, 2011 at 11:31 am
What would be your recommended approach to using girl_friday with AMQP in the absence of a reactor? I would like messages from my event bus to trigger girl_friday jobs, but the AMQP gem runs in EM. There are others like bunny but I think they would require polling to get messages. I was thinking I could run EM in a thread and stick messages into gf queues as they arrived – is there a better solution that doesn’t require a reactor?
8 Pavel // Nov 20, 2011 at 2:13 am
When I need only a few simple background jobs, I do it like this: https://gist.github.com/1379969