Cassandra and Mongo passed the first set of winnowing tests – both in terms of speed and functionality. In order to provide more data for an informed decision, I subjected both to a series of high throughput tests. The test suite was a total of 120 million operations (insert, update, get) – 12 runs of 10 million requests.
For those impatient for the ultimate conclusion, the executive summary is: Cassandra writes are faster, Mongo reads are faster. But what is interesting is how the latencies evolve over time, i.e. over the course of 120 million requests.
Overview of test suite
- Remove all disk data of the persistent store with rm -rf
- Start the NoSQL server
- Maintain a constant throughput of 20 or 50 concurrent client threads with no rampup time
- Insert N records
- Get N records – this will reflect any caching or compaction (e.g. Cassandra SSTables)
- Get N records – this should reflect already compacted tables – should be faster
- Get N records – should correspond to previous get
- Update N keys with different sized values (value seed). We want to make sure we’re exercising Mongo compaction
- Repeat the three get runs again
Results Commentary
When one result is faster than another, the important question to always ask is: how important is the difference? Does it really matter? Is the slower system still fast enough? If the lower value still satisfies your requirements, then the difference is of little signficance and other factors weigh more heavily on the ultimate decision.
In order to interpret results, it is important to have a basic understanding of some statistics. You should not rely solely on the mean, but also pay particular attention to latency percentiles. Percentiles give you an idea of the distribution of results which is especially important when predictability is required. Your SLA might say: P reads MUST be less than 10ms where P is the percentage, e.g. 99% or 100%. This of course raises the question what is to be done for those operations that exceed N, but that is another discussion. As you can see, there are some real whoppers – note the Max value for put – 37 seconds for MongoDB and the 175 seconds for Cassandra!
As the tables below indicate, Cassandra was a winner for puts – both inserts and updates. For the initial 10M puts, the distribution is interesting – 50% of Mongo puts were under 1ms while for Cassandra it was 2ms. But for the 90-th percentile, Mongo took 20ms but Cassandra was 4ms.
For the subsequent two batches of 10M puts (inserts), Cassandra’s throughput was 4054 and 5868 – quite a bit of variability! Note the max value for a put was 174 seconds – must have been one heck of a GC! Mongo’s two insert batches ranged from 1753 to 2197 puts per second. 90% of Cassandra puts are finishing under 4ms, whereas 90% of Mongo puts are under 17ms.
Cassandra’s superior write numbers are not surprising since it is optimized for faster writes than reads. See here. In order to avoid expensive disk seeks, writes are appended to sequential files called SSTables. However the hit is taken on reads, where potentially the system has to examine several SSTables for updated values. Unfortunately, for our needs it fast writes were of more importance.
For reads the situation is quite different and Mongo comes out ahead. For the first three get batches Cassandra is giving us between 2300-2600 gets per second. Mongo is in the predictable 3100s. Both of these fall within our target SLA and are roughly comparable.
However, its a completely different story for reads after we do inserts! Look at the numbers: Mongo is slightly slower at 2999 and 3099, but Cassandra drops tenfold to 294 and 350! Que caramba! I’m assuming this is because of the need to access multiple SSTables, but I would’ve assumed that at some point the values were merged into one file.
Perhaps there’s some tuning that needs to be done for Cassandra though I saw nothing obvious in my initial look at the documentation. All I can say with confidence is, is that un-tuned Cassandra get operations drastically slow down. I did invoke an explicit compaction using nodetool, and throughput for next 10 millions gets did improve to 1000/sec. But the caveat was that Cassandra documentation did not recommend this since automated compactions will not be run after a manual one! By the way, one of Mongo’s strong suits is that they strive to minimize the amount of configuration required to get a system running.
Context
In order to establish a baseline for the vendor’s capabilities, a non-clustered server was used. For Cassandra this was single node cluster, for Mongo simply one server with no replication. Cluster tests were run for functionality.
Test Configuration
- Number of requests: 10,000,000
- Object Mapper: Jackson JSON
- Key size: 32
- Value size: average is 1000 bytes, range from 100 to 1900
- Centos 5.5 Linux
- Intel Xeon 2.4 MHz 64 bit processor
- 8 processors, 2 cores each
- 8 GB RAM
- The client pinging the server averaged 0.2 ms
int baseSize = 100 ; int range = 1800 ; Random random = new Random(valueSizeSeed) int valueSize = random.nextInt(range) + baseSize ;
MongoDB
- Version: 2.0.0
- Journaling: turned on
Test Req/Sec Millis 50% 90% 99% 99.5% 99.9% Max Mean Err Fail Thr ValSeed Put 2114 4729454 1 20 43 60 1576 36232 9.442 0 0 20 1776 Get 3182 3142258 2 10 18 28 273 4684 6.268 0 0 20 Get 3182 3142850 2 10 18 26 229 3950 6.269 0 0 20 Get 3106 3219539 2 33 88 144 2120 3885 16.060 0 0 50 Put 1753 5706060 1 17 76 278 1985 44472 11.395 0 0 20 1812 Put 2197 4552045 1 17 52 182 1293 37299 9.087 0 0 20 1846 Get 2999 3333966 2 11 19 39 308 4380 6.651 0 0 20 Get 3039 3290874 2 10 19 41 289 4676 6.565 0 0 20 PutGet 907 11027045 14 28 70 150 2531 8886 22.036 0 0 20 1861 Get 2992 3342034 2 11 20 40 299 4299 6.666 0 0 20 Get 2975 3361762 2 11 20 38 301 4478 6.707 0 0 20 Get 2957 3381393 2 34 112 166 2160 4363 16.871 0 0 50
Cassandra – Version: 0.8.6
Test Req/Sec Millis 50% 90% 99% 99.5% 99.9% Max Mean Err Fail Thr ValSeed Put 6045 1654219 2 4 16 24 160 4169 3.288 0 0 20 1776 Get 2334 4285208 1 14 111 150 427 9003 8.552 0 0 20 Get 2380 4202468 2 10 110 144 290 8380 8.387 0 0 20 Get 2616 3822749 8 40 177 226 1559 24092 19.069 0 0 50 Put 4054 2466583 2 4 17 26 194 174614 4.915 0 0 20 1812 Put 5768 1733745 2 4 16 24 172 4190 3.446 0 0 20 1846 Get 294 33972166 43 141 371 567 1737 67276 67.928 0 0 20 Get 350 28551712 39 116 256 354 1629 37371 57.087 0 0 20 PutGet 285 35124479 40 124 388 801 2384 456321 70.232 0 0 20 1861 Get 210 47730677 79 182 341 418 1141 36368 95.446 0 0 20 Get 211 47305663 79 180 335 409 1097 104592 94.595 0 0 20 Get 249 40157243 175 339 598 730 2164 77675 200.751 0 0 50
Variability Examples
Here are some examples demonstrating the variability in results. The examples are from identical tests in the same context.
Mongo
First 10M puts Test Req/Sec Millis 50% 90% 99% 99.5% 99.9% Max Mean Err Fail Thr Put 2137 4679117 1 21 41 49 1388 40931 9.338 0 0 20 Put 2156 4639027 1 21 41 48 1450 36412 9.258 0 0 20 Put 2114 4729454 1 20 43 60 1576 36232 9.442 0 0 20 Put 2079 4810938 1 20 42 55 1354 38436 9.605 0 0 20 First 10M gets Test Req/Sec Millis 50% 90% 99% 99.5% 99.9% Max Mean Err Fail Thr Get 3187 3138189 2 10 18 25 225 4665 6.260 0 0 20 Get 3115 3210200 2 10 18 26 258 13211 6.403 0 0 20 Get 3182 3142258 2 10 18 28 273 4684 6.268 0 0 20 Get 3141 3184198 2 10 18 27 271 22285 6.352 0 0 20
December 1, 2011 at 10:28 |
Your blog is pretty informative and interesting!
Did you post these observations about 10 fold drop in performance to apache or datastax community?
It will be interesting to know whether it is a tuning issue or a limitation.
January 18, 2012 at 13:49 |
It would be great if this test can be repeated with latest cassandra 1.0.x branch and latest MongoDB 2.0.x branch. Both have evolved quite a bit in this course of time and cassandra gained some optimizations for read performance.
It would be great to see how they compare now. May be a rolling comparison of these two monster DBs could be just what an architect would need to asses trends in performace optimizations and make an informed decision for his needs.
Nevertheless the effort to compare them deserves applause and appreciation.
Cheers.
January 24, 2012 at 18:33 |
I´ve been meaning to compare the latest versions when I get some time. Currently excitingly examining DynamoDB from Amazon.
November 26, 2012 at 18:12 |
Hi Andre – I was wondering if read stats(dropping 10 fold) still apply to latest Cassandra/MongoDB branch?
(if you played around and compared with latest branches..)
July 25, 2014 at 03:25 |
Ha, ha – need to update tests. So many NoSQL systems and so little time.
May 5, 2013 at 19:12 |
About the slowdown after a ton of inserts.
http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts