MongoDB or How I learned to stop worrying and love SQL

October 9, 2012

We’ve been using MongoDB for a year and a half at ThoughtLeadr. During that time we’ve gone from elation to depression using this trendy NoSQL datastore. Based on the documentation, it’s not hard to see why you’d get pulled in: a schemaless, performant database that can use both sharding and replica sets to maintain high availability at nearly limitless scale. Well, I guess I should have known that when something seems too good to be true, it probably is. Here’s the lowdown on MongoDB’s web-scale breakdown.

Global lock

I’ll admit that I didn’t realize the global lock (now a database-level lock) was such a major issue when I first started using MongoDB. I’ve never written database internals myself. While that doesn’t excuse me from doing my homework, I bought into 10gen’s benchmarks page. Oh wait, they don’t have benchmarks? Strange, I remember reading all these great articles about MongoDB’s performance when I picked it up. Wayback Machine to the rescue. This is one of the most frustrating aspects of working with MongoDB: the global lock has far greater repercussions in production than what you see in the benchmarks. I’ll get into specifics below, but the global lock is a recurring theme throughout, a serious flaw in any database design that should have been represented more honestly.

MapReduce is useless at web scale

When you run a MapReduce job against a database, the global write lock stops any other process from manipulating that data. That means if you run even a moderate number of MapReduce jobs per hour, you massively degrade application performance against those collections. Plus, you can block the rest of the replica set from sync’ing in a timely fashion, which can cause “primary” database switches and accidental loss of data.

One of the core principles of MapReduce is the ability to get concurrent data processing out of a system so that large datasets can be analyzed quickly. With MongoDB, MapReduce jobs run inside a single-threaded JavaScript VM, which eliminates concurrent processing and slows web-scale data processing to a crawl.

Sysadmin tooling is painful

Every major sysadmin tool is a blocking process. Most of this comes down to the global write lock or to immature tooling, but in practical terms, if you need to modify your database structure in production, you’ll be forced to take downtime.

Here’s a great example: database compaction only works if you are under 50% disk utilization. Let me repeat that. Database compaction only works if you’re not using more than half your disk. Have you ever created a collection only to realize later that you don’t really need it? If you have over half your disk in use, you’ll need to take one of the replica set members offline, delete the entire database manually, then bring it back up to sync all the data from the primary. Only after it has finished sync’ing, a process that can take days, can you promote that system to primary and do the same with the other members of the replica set.
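The headroom rule above is easy to turn into a pre-flight check before you schedule any compaction. What follows is only a minimal sketch in Python: the 50% threshold is taken from our own experience described above, not from any official figure, and the function names and the idea of checking the data volume with shutil.disk_usage are our assumptions for illustration.

```python
import shutil


def can_compact_in_place(used_bytes: int, total_bytes: int,
                         max_utilization: float = 0.5) -> bool:
    """Return True if disk utilization is strictly below the threshold.

    The 0.5 default mirrors the "under 50%" rule of thumb from the text;
    it is an assumption, not a documented MongoDB constant.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes < max_utilization


def volume_can_compact(path: str) -> bool:
    """Apply the same check to the volume holding `path` (hypothetical helper)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return can_compact_in_place(usage.used, usage.total)
```

If the check fails, you are in the take-a-member-offline-and-resync scenario described above, so it is worth alerting on well before the disk crosses the halfway mark.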

A woeful lack of production grade tooling

We never really felt the need to use an ORM tool with MongoDB since its JSON data structures map nicely to their analogs in both Python and Haskell (our core languages). However, after using MongoDB in production for over a year, migrations became a real pain. There are no mature tools to simplify this process, which forces any adopter to eventually roll their own migration system. Plus, there wasn’t a clear best practice for migrating objects in storage. Do you use a framework to lazily migrate objects as you need them? Or do you run a migration script to update all the data at one time (updating some objects even if you never use them again)? The answer is both, depending heavily on the situation. When you spend most of your time in the NoSQL world, it’s easy to forget that migration support is built into SQL with the ALTER TABLE command.
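To make the two options concrete, here is a rough sketch in Python of what a home-grown migration layer tends to look like. It is not our production code: the schema_version field, the example document shape, and every function name are hypothetical, and it works on plain dicts so it does not depend on any particular driver.

```python
from typing import Callable, Dict, Iterable, List

Document = Dict[str, object]


def _split_name(doc: Document) -> Document:
    """Hypothetical v1 -> v2 step: split "name" into first and last name."""
    doc = dict(doc)
    first, _, last = str(doc.pop("name", "")).partition(" ")
    doc["first_name"] = first
    doc["last_name"] = last
    return doc


def _add_tags(doc: Document) -> Document:
    """Hypothetical v2 -> v3 step: give every document a "tags" list."""
    doc = dict(doc)
    doc.setdefault("tags", [])
    return doc


# Each step is keyed by the schema version it upgrades FROM.
MIGRATIONS: Dict[int, Callable[[Document], Document]] = {
    1: _split_name,
    2: _add_tags,
}
LATEST_VERSION = 3


def migrate(doc: Document) -> Document:
    """Bring one document up to LATEST_VERSION; documents without a
    schema_version field are assumed to be version 1."""
    doc = dict(doc)
    version = int(doc.get("schema_version", 1))
    while version < LATEST_VERSION:
        doc = MIGRATIONS[version](doc)
        version += 1
        doc["schema_version"] = version
    return doc


def load_lazily(raw: Document) -> Document:
    """Lazy strategy: upgrade a document only when the application reads it."""
    return migrate(raw)


def migrate_all(docs: Iterable[Document]) -> List[Document]:
    """Batch strategy: upgrade everything in one pass, used or not."""
    return [migrate(d) for d in docs]
```

Both strategies share the same migrate function, which is why "both" ends up being a workable answer: the lazy path keeps reads correct while a batch job catches up in the background. With SQL, all of this bookkeeping is replaced by a schema change.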

What’s next?

Honestly, we started using MongoDB because of its great documentation and blazing developer speed (it is amazingly fast to get up and running and start shipping features). The problems only crop up when your product has real traction, real data, and real scale. Then it becomes apparent that MongoDB isn’t ready for prime time. We’ve already switched over to Percona for our production metadata database, but we’re not done with NoSQL. Our full database stack still includes Redis and Riak, since we need both fast I/O and big data storage, respectively.