MongoDB is a relatively new “schema-free, document-oriented database.” Its closest competitors are probably CouchDB and Tokyo Cabinet’s Table database, but all three differ in significant ways:
- CouchDB guarantees the ACID properties when saving documents through an MVCC mechanism, like PostgreSQL. Tokyo Cabinet provides ACID support via locking, like MySQL. Mongo updates documents in place with no real support for concurrency control (e.g. optimistic or pessimistic locking). This means Mongo will be much faster for writes and scale horizontally very easily, at the expense of guaranteed data consistency. This is a very common tradeoff.
- Couch and Mongo support datatypes for the values in documents, but Mongo uses a binary JSON representation and protocol, which makes it faster over the wire. Tokyo Cabinet does not support datatypes, except for string and number types in indexes. It does not have native boolean and date types, which means you can’t efficiently do queries like “created_at < 1 week ago”, although you could store dates and booleans as numbers to work around this limitation.
- Couch and Mongo support more complex structures (arrays, hashes) as values. Tokyo only supports basic datatypes.
- Couch requires you to instantiate views for the queries required by your application. Couch will then auto-index the data required to fetch the view. Tokyo Cabinet and Mongo have a more traditional RDBMS notion of indexes, which are maintained separately from the table.
- CouchDB is written in Erlang while MongoDB is written in C++ and Tokyo Cabinet in C. I’m inclined to trust Erlang more for distributed infrastructure, given its long history in telecom. That said, I have no evidence that the other two are anything but rock solid.
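The date workaround mentioned above (storing dates as numbers) can be sketched in a few lines. This is only an illustration in plain Python, not code for any of the three databases: the `records` dict, the `build_index` and `created_before` helpers, and the fixed `now` value are all made up for the example. The sorted list stands in for a numeric index, and the “older than a week” query becomes a simple numeric range lookup.

```python
import bisect

WEEK_SECONDS = 7 * 24 * 60 * 60
DAY_SECONDS = 24 * 60 * 60


def build_index(records):
    """Return (created_at, key) pairs sorted by timestamp (a stand-in for a numeric index)."""
    return sorted((rec["created_at"], key) for key, rec in records.items())


def created_before(index, cutoff):
    """Return the keys whose created_at is strictly less than cutoff."""
    # (cutoff,) sorts before every (cutoff, key) pair, so bisect_left
    # lands on the first entry whose timestamp is >= cutoff.
    pos = bisect.bisect_left(index, (cutoff,))
    return [key for _, key in index[:pos]]


# Hypothetical data: dates are stored as epoch seconds, not as date objects.
now = 1_252_000_000  # fixed "current time" so the example is repeatable
records = {
    "post:1": {"title": "old", "created_at": now - 10 * DAY_SECONDS},
    "post:2": {"title": "recent", "created_at": now - 2 * DAY_SECONDS},
    "post:3": {"title": "older", "created_at": now - 30 * DAY_SECONDS},
}

index = build_index(records)
old_keys = created_before(index, now - WEEK_SECONDS)
print(old_keys)
```

The same idea applies to booleans stored as 0 and 1: once the value is a number, a numeric index can serve the query.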
All three projects are interesting takes on the traditional RDBMS datastore. CouchDB would be useful where you absolutely must keep ACID and transactions to ensure data integrity but want to avoid the hard-coded schema that a traditional database requires. Semantic web applications come to mind, where your objects are just a bag of attributes.
MongoDB seems designed more for applications that need dynamic query functionality with high performance and can sacrifice data integrity to get it; metrics and operational data come to mind. As the MongoDB website says: “High volume, low value data”.
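To make the integrity tradeoff concrete, here is a toy sketch of what can go wrong when two clients read, modify and write back the same document. It assumes a store that overwrites in place with no check at all, and compares it with a store that rejects stale writes using a revision number. Both classes (`InPlaceStore`, `RevisionedStore`) are hypothetical in-memory stand-ins written for this example; they are not the API of MongoDB, CouchDB or Tokyo Cabinet.

```python
class InPlaceStore:
    """Last writer wins: save() overwrites whatever is stored."""

    def __init__(self):
        self.docs = {}

    def get(self, key):
        return dict(self.docs[key])

    def save(self, key, doc):
        self.docs[key] = dict(doc)


class RevisionConflict(Exception):
    """Raised when a save is based on an outdated revision."""


class RevisionedStore:
    """Optimistic check: a save must carry the revision it was read at."""

    def __init__(self):
        self.docs = {}

    def get(self, key):
        return dict(self.docs[key])

    def save(self, key, doc):
        current = self.docs.get(key)
        if current is not None and doc.get("rev") != current["rev"]:
            raise RevisionConflict(key)
        stored = dict(doc)
        stored["rev"] = current["rev"] + 1 if current else 1
        self.docs[key] = stored


# Two clients increment the same counter on the in-place store.
in_place = InPlaceStore()
in_place.save("hits", {"count": 0})
a, b = in_place.get("hits"), in_place.get("hits")
a["count"] += 1
b["count"] += 1
in_place.save("hits", a)
in_place.save("hits", b)  # silently overwrites a's increment

# The same sequence on the revisioned store.
revisioned = RevisionedStore()
revisioned.save("hits", {"count": 0})
a, b = revisioned.get("hits"), revisioned.get("hits")
a["count"] += 1
b["count"] += 1
revisioned.save("hits", a)
conflict_seen = False
try:
    revisioned.save("hits", b)  # b was read at an older revision
except RevisionConflict:
    conflict_seen = True
    b = revisioned.get("hits")  # re-read, reapply, retry
    b["count"] += 1
    revisioned.save("hits", b)

print(in_place.get("hits")["count"], revisioned.get("hits")["count"])
```

The unchecked store is simpler and does less work per write, which is the performance side of the tradeoff; the cost is that one of the two increments disappears without any error.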
Tokyo Cabinet feels a little more traditional and lower-level, like a layer on top of BerkeleyDB. It’s similar in design in that they are both C libraries and not designed to run as standalone daemons themselves. It would be great for embedded applications.
In my next post, I’ll try out each with its latest Ruby driver and see how they perform in basic use cases. Did I get anything wrong? Leave a comment and let me know!
Update: Tokyo Tyrant does not appear to support transactions and so Tokyo Cabinet cannot guarantee ACID when used as a service.
8 responses so far ↓
1 Flinn // Sep 1, 2009 at 10:00 pm
I think Tokyo Tyrant is a better comparison than Cabinet here. For that, yes you lose transactions but you get things like master/slave and master/master replication, user defined functions (in Lua) and more idiomatic Ruby interfaces, rufus-tokyo and ruby-tokyotyrant (mine).
Also, no mention on performance? I believe the latest numbers show Tokyo Tyrant, Mongo, CouchDB in that order.
Important to note, CouchDB uses map/reduce for its views; dynamic queries are (supposedly) slower. Mongo and Tyrant both support indexes in the RDBMS sense and I believe both support full text search. Mongo allows for indexes (I believe a limit of 10 per collection at the moment). Tyrant has no limitation on indexes (that I know of) but doesn’t support collections… meaning completely irregular data and as far as I know this degrades the value of indexes, though the simple answer is to run a daemon per collection. My guess is in both Mongo and Tyrant a full table scan is probably faster than CouchDB (someone correct me if I’m wrong).
2 Justin Leitgeb // Sep 1, 2009 at 10:45 pm
Great to see these non-relational systems getting attention. I’ve found Tokyo Cabinet to be a very handy tool for building small apps with Sinatra, and also for building out Rails apps where constant-time lookup can be used instead of queries on an index. Looking forward to the follow-up!
3 dm // Sep 2, 2009 at 8:15 am
Very good summary.
MongoDB supports concurrency: it supports atomicity at the single-document level only (not complex transactions like an RDBMS). But I believe CouchDB atomicity is single-document too (could be wrong, someone please correct me)?
MongoDB relaxes the ‘D’ (durability) in ACID for performance, although it can be durable by using its replication, especially over-the-WAN replication; then it is highly durable IMO.
Agree with @Flinn that best cross compare might be Tyrant as it is client/server like couch and mongo.
4 Stephen // Sep 3, 2009 at 9:12 pm
Nitpicking, but note that MySQL uses MVCC with the InnoDB and Falcon engines.
5 Shane K Johnson » Blog Archive » How I learned to say ‘No’ to SQL // Sep 30, 2009 at 9:23 am
[...] blog post provides a nice comparison of the differences between CouchDB and Tokyo [...]
6 Document-oriented Database Shootout Part 2: Performance // Oct 16, 2009 at 9:40 pm
[...] talking about document-oriented databases in general in Part 1, for Part 2 I’ve written some code comparing MongoDB 1.1.1, CouchDBX 0.9.1 and Tokyo Tyrant [...]
7 Jumping aboard the NOSQL train | crealytics Blog // Mar 8, 2010 at 10:41 am
[...] feature comparison you could possibly wish for. We did some research into some of the fairly good comparisons out there but they tend to be outdated pretty quickly with this blazingly fast evolving topic, so [...]
8 Andy // Jun 4, 2010 at 9:07 am
> Couch and Mongo support more complex structures (arrays, hashes) as values. Tokyo only supports basic datatypes.
Tokyo actually does support hashes (via “table” extension which has nothing to do with rigid RDBMS tables) but they are “flat” (key/value pairs with a primary key). Still, this is flexible enough for most cases.