Archive for the ‘Riak’ Category

No SQL Data Store Evaluation

October 18, 2011

Recently I’ve been doing some extensive and intensive evaluation of various No SQL implementations capable of storing 600,000 key/value items with a latency preferably under 50 milliseconds. Extensive in the sense of examining many providers and intensive in drilling down to test key features of each system. The specific requirements precluded looking at all stores, but a number of quite unique systems – each with its strong personality – were examined in substantial detail.

The initial requirements were:

  • Eventually be able to store six months of user profiles for ad engagements
  • Insertion of bout 3-5 million records daily – an equivalent amount had to be deleted
  • Key/value access – no need for range or index queries
  • All keys were random MDA hashes
  • Retrieval latency under 50 ms
  • Timestamp expiration of records
  • Reads needed to be very fast – writes could be batched and delayed
  • Ability to manipulate nodes of to live cluster, e.g. add a new node without restart
In order to execute the evaluation I developed a suite of structured high load tests that could be run against all providers.  Besides raw benchmarks there were also tests for key technical issues such as failover behavior. It was not possible for all implementations to be comparable for all features. Obviously providers made different design and feature trade-offs – that’s their unique personality.
Some topics examined:
  • Predictability – we need low variability in response time – latency percentiles are helpful here.
  • Repeatibility – we want to run the same test in the same context repeatedly with same results subject to an obvious margin or error. Establishing a consistent margin is a difficult chore.
  • Consistency model – eventual consistency and/or immediate. How is the CAP theorem handled?
  • Failover – when a node dies what happens to reads and writes?
  • Sharding – ability to shard data and key partitioning strategies.
  • Compaction policy – how is data stored, how are updates and deletions handled (tombstones). If implemented in Java, how much of a hist is garbage collection? I
The following systems were evaluated, with the first two as  the final candidates.
  • Cassandra
  • MongoDB
  • Membase
  • Riak
  • Redis
In addition the idea of a bank of caching servers fronting the NoSQL stores was examined – Memcached, Ehcache REST server, and custom REST servers with a pluggable cache engine (Ehcache, Infinispan). In some cases the NoSQL stores were so performant (MongoDB) that a separate caching layer was not immediately necessary. Perhaps under greater volume it would be. They key was to have a provisional caching strategy as a reserve.

MongoDB is very fast and reliable and it made the first cut. Membase also proved fast, but it had some reliability issues especially in server restarts. Cassandra provided the full set of features we needed though it was significantly slower than MongoDB.  Since we were testing with a constant client thread count of 20 or 50, being able to stay alive and keep up with this rate was important.

One of the frustrating aspects was the difficulty in obtaining adequate or minimal hardware. Since the raison de etre of the project was high volume, it was important to test with as large numbers as possible. Differences between low and high volume scenarios can give you drastically different results and lead you to costly wrong conclusions. There is also a natural tension between reasonable turnaround time for results and the volume and rate of data testing.  For example, to run a standard  melange of CRUD operations (inserts, gets, updates) 14 million times, MongoDB takes 20 hours versus 73 hours for Cassandra. That’s a lot of time to wait for the results of a configuration toggle switch!

One of the great parts of the project was the ability to study in detail key features of  systems and then implement test scenarios that exercised these features. A good example of a praxis: applying solidly founded ideas into practice. One has to first discover and understand what has to be tested. Comparing eventual consistency systems such as Cassandra and Riak is a challenging endeavor that stimulates the intellect. For instance, how do Bloom filters (slight risk of false positives) obviate the need for disk access? How do go about testing to see if this is true? Bloom filters are even used in DNA sequencing – see here. Then throw in MongoDB’s differing but overlapping approach, and you get quite a rich NoSQL stew. Documentation and articulation of key features was an important evaluation criterion.