Archive for the ‘Spring’ Category

Spring Configuration – Selecting An Alternate Implementation

November 9, 2011

A common recurring pattern in software development is the need to select at runtime a specific instance of an interface. This instance can either be a distinct implementation class of an interface or the same class  but instantiated with different properties. Spring provides unparalleled abilities to define different bean instances. These can be categorized as following:

  • Each bean is a different implementation class of the interface.
  • Each bean is the same implementation class but has different configuration.
  • A mixture of the two above.

The canonical example is selecting a mock implementation for testing instead of the actual target production implementation. However there are often business use cases where alternate providers need to be selectively activated.

The goal is to externalize the selection mechanism by providing a way to toggle the desired bean name. We want to avoid  manually commenting/uncommenting bean names inside a Spring XML configuration file. In other words, the key question is: how to toggle the particular implementation?

A brief disclaimer note: this pattern is most applicable to Spring 3.0.x and lower. Spring 3.1 introduces some exciting new features such as bean definition profiles dependent upon different environments. See the following articles for in-depth discussions:

There are two variants of this pattern:

  • Single Implementation – We only need one active implementation at runtime.
  • Multiple Implementations – We need several implementations at runtime so the application can dynamically select the desired one.
Assume we have the following interface:
  public interface NoSqlDao<T extends NoSqlEntity>  {
     public void put(T o) throws Exception;
     public T get(String id) throws Exception;
     public void delete(String id) throws Exception;
  }

  public interface UserProfileDao extends NoSqlDao<UserProfile> {
  }

Assume two implementations of the interface:

  public class CassandraUserProfileDao<T extends UserProfile>
    implements UserProfileDao

  public class MongodbUserProfileDao<T extends UserProfile>
    implements UserProfileDao

Single Loaded Implementation

In this variant of the pattern, you only need one implementation at runtime. Let’s assume that the name of the bean we wish to load is userProfileDao.

  ApplicationContext context = new ClassPathXmlApplicationContext("applicationContext.xml");
  UserProfileDao userProfileDao = context.getBean("userProfileDao",UserProfileDao.class);

The top-level applicationContext.xml file contains common global beans and an import statement for the desired provider. The value of the imported file is externalized as a property called providerConfigFile. Since each provider file is mutually exclusive, the bean name is the same in each file.

  <beans>
    <bean id="propertyConfigurer"
          class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
      <property name="location" value="classpath:context-PropertyOverrideConfigurer.properties" />
      <property name="systemPropertiesModeName" value="SYSTEM_PROPERTIES_MODE_OVERRIDE" />
    </bean>
    <import resource="${providerConfigFile}"/>
  </beans>

The provider-specific configuration files are:

  applicationContextContext-cassandra.xml
  applicationContextContext-mongodb.xml
  applicationContextContext-redis.xml
  applicationContextContext-riak.xml
  applicationContextContext-membase.xml
  applicationContextContext-oracle.xml

For example (note the same bean name userProfileDao):

  applicationContext-cassandra.xml

    <bean id="userProfileDao" class="com.amm.nosql.dao.cassandra.CassandraUserProfileDao" >
      <constructor-arg ref="keyspace.userProfile"/>
      <constructor-arg value="${cassandra.columnFamily.userProfile}"/>
      <constructor-arg ref="userProfileObjectMapper" />
    </bean> 

  applicationContext-mongodb.xml

    <bean id="userProfileDao" class="com.amm.nosql.dao.mongodb.MongodbUserProfileDao">
      <constructor-arg ref="userProfile.collectionFactory" />
      <constructor-arg ref="mongoObjectMapper" />
    </bean>

At runtime you need to specifiy the value for the property providerConfigFile.  Unfortunately with Spring 3.0. this has to be a system property and cannot be specified inside a properties file! This means it will work for a stand-alone Java application but not for a WAR unless you pass the value externally to the web server as a system property. This problem has been allegedly fixed in Spring 3.1 (I didn’t notice it working for 3.1.0.RC1). For example:

  java
    -DproviderConfigFile=applicationContextContext-cassandra.xml
    com.amm.nosql.cli.UserProfileCli

Multiple Loaded Implementations 

With this variant of the pattern, you will need to have all implementations loaded into your application context so you can later decide which one to choose. Instead of one import statement,  applicationContext.xml is imports all implementations.

  <import resource="applicationContextContext-cassandra.xml />
  <import resource="applicationContextContext-mongodb.xml />
  <import resource="applicationContextContext-redis.xml />
  <import resource="applicationContextContext-riak.xml />
  <import resource="applicationContextContext-membase.xml />
  <import resource="applicationContextContext-oracle.xml />

Since you have one namespace, each implementation has to have a unique bean name for the UserProfileDao implementation. Using our previous example:

applicationContext-cassandra.xml

  <bean id="cassandra.userProfileDao" class="com.amm.nosql.dao.cassandra.CassandraUserProfileDao" >
    <constructor-arg ref="keyspace.userProfile"/>
    <constructor-arg value="${cassandra.columnFamily.userProfile}"/>
    <constructor-arg ref="userProfileObjectMapper" />
  </bean> 

applicationContext-mongodb.xml

  <bean id="mongodb.userProfileDao" class="com.amm.nosql.dao.mongodb.MongodbUserProfileDao">
    <constructor-arg ref="userProfile.collectionFactory" />
    <constructor-arg ref="mongoObjectMapper" />
  </bean>

Then inside your Java code you need to have a mechanism to select your desired bean, e.g. load either cassandra.userProfileDao or mongodb.userProfileDao. For example, you could have a test UI containing a dropdown list of all implementations. Or you might have a case where you even had a need to access two different NoSQL stores via a UserProfileDao interface.

Advertisements

MongoDB and Cassandra Cluster Failover

October 21, 2011

One of the most important features of a scalable clustered NoSQL store is how to handle failover. The basic question is: is failover seamless from the client perspective? This is not always immediately apparent from vendor documentation especially for open-source packages where documentation is often wanting. The only way to really know is to run your own tests – to both verify vendor claims and to fully understand them. Caveat emptor – especially if the product is free!

Mongo and Cassandra use fundamentally different approaches to clustering. Mongo is based on the classical master/slave model (similar to MySQL) whereas Cassandra is a peer-to-peer system modeled on the eventual consistency paradigm pioneered by Amazon’s Dynamo system. This difference has specific ramifications regarding failover capabilities. The design trade-offs regarding the CAP theorem are described very well in the dbmusings blog post Overview of the Oracle NoSQL Database (Section CAP).

For Mongo, the client can only write to the master and therefore it is a single point of failure. Until a new master is elected from the secondaries, the cluster will not be reachable for writes.

For Mongo reads, you have two options: either talk to the master or to the secondaries.  The default  is to read from the master, and you will be subject to the same semantics as for writes. To enable reading from secondaries, you can call Mongo.slaveOk() for you client driver. For details see here.

For Cassandra, as long  as your selected consistency policy is satisfied, failing nodes will not prevent client access.

Mongo Setup

I’ll illustrate the issue with a simple three node cluster. For Mongo, you’ll need to create a three-node replica set which is well described on the Replica Set Tutorial page.

Here’s a convenience shell script to launch a Mongo replica set.

  dir=/work/server-data/mongo-cluster
  mv nohup.out old-nohup.out
  OPTS="--rest --nojournal --replSet myReplSet"
  nohup mongod $OPTS --port 27017 --dbpath $dir/node0  &
  nohup mongod $OPTS --port 27018 --dbpath $dir/node1  &
  nohup mongod $OPTS --port 27019 --dbpath $dir/node2  &

Cassandra Setup

For Cassandra, create a keyspace with replication factor of 3 in your schema definition file.

  create keyspace UserProfileKeyspace
    with strategy_options=[{replication_factor:3}]
    and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';

Since I was using the standard Java Hector client along with Spring, I’ll highlight some of the key Spring bean configurations. The important point to note is the consistency policy must be quorum which means that for an operation (read or write) to succeed, two out of the three nodes must respond.

Properties File

  cassandra.hosts=host1,host2,host3
  cassandra.cluster=MyCluster
  cassandra.consistencyLevelPolicy=quoromAllConsistencyLevelPolicy

Application Context File

  <bean id="userProfileKeyspace"
        class="me.prettyprint.hector.api.factory.HFactory"
        factory-method="createKeyspace">
    <constructor-arg value="UserProfileKeyspace" />
    <constructor-arg ref="cluster"/>
    <property name="consistencyLevelPolicy" ref="${cassandra.consistencyLevelPolicy}" />
  </bean>

  <bean id="cluster"
        class="me.prettyprint.cassandra.service.ThriftCluster">
    <constructor-arg value="${cassandra.cluster}"/>
    <constructor-arg ref="cassandraHostConfigurator"/>
  </bean>

  <bean id="cassandraHostConfigurator"
        class="me.prettyprint.cassandra.service.CassandraHostConfigurator">
     <constructor-arg value="${cassandra.hosts}"/>
  </bean>

  <bean id="quorumAllConsistencyLevelPolicy"
        class="me.prettyprint.cassandra.model.QuorumAllConsistencyLevelPolicy" />

  <bean id="allOneConsistencyLevelPolicy"
        class="me.prettyprint.cassandra.model.AllOneConsistencyLevelPolicy" />

Test Scenario

The basic steps of the test are as follows. Note the Mongo example assumes you do not have slaveOk turned on.

  • Launch cluster
  • Execute N requests where N is a large number such as 100,000 to give your cluster time to fail over. The request is either a read or write.
  • While your N requests are running, kill one of the nodes. For Cassandra this can be any node since it is peer-to-peer. For Mongo, kill the master node.
  • For Cassandra, there will be no exceptions. If your requests are inserts, you will be able to subsequently retrieve them.
  • For Mongo, your requests will fail until the secondary is promoted to the master. This happens for both writes and reads. The time window is “small”, but depending upon your client request rate, the number of failed requests can be quite a few thousand! See the sample exceptions below.
With Mongo, the client can directly access only the master node. The secondaries can only be accessed by the master and never directly by the client. With Cassandra, as long as the minimum number of nodes as specified by the quorum can be reached, your operation will succeed.

Mongo Put Exception Example

I really like the message “can’t say something” – sort of cute!

com.mongodb.MongoException$Network: can't say something
    at com.mongodb.DBTCPConnector.say(DBTCPConnector.java:159)
    at com.mongodb.DBTCPConnector.say(DBTCPConnector.java:132)
    at com.mongodb.DBApiLayer$MyCollection.update(DBApiLayer.java:343)
    at com.mongodb.DBCollection.save(DBCollection.java:641)
    at com.mongodb.DBCollection.save(DBCollection.java:608)
    at com.amm.nosql.dao.mongodb.MongodbDao.put(MongodbDao.java:48)

Mongo Get Exception Example

com.mongodb.MongoException$Network: can't call something
    at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:211)
    at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:222)
    at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:231)
    at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:303)
    at com.mongodb.DBCursor._check(DBCursor.java:360)
    at com.mongodb.DBCursor._next(DBCursor.java:442)
    at com.mongodb.DBCursor.next(DBCursor.java:525)
    at com.amm.nosql.dao.mongodb.MongodbDao.get(MongodbDao.java:38)

Mongo Get with slaveOk() 

If you invoke Mongo.slaveOk() for your client driver, then your reads will not fail if a node goes down. You will get the following warning.

Oct 30, 2011 8:12:22 PM com.mongodb.ReplicaSetStatus$Node update
WARNING: Server seen down: localhost:27019
java.net.SocketException: Connection reset by peer: socket write error
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at org.bson.io.PoolOutputBuffer.pipe(PoolOutputBuffer.java:129)
        at com.mongodb.OutMessage.pipe(OutMessage.java:160)
        at com.mongodb.DBPort.go(DBPort.java:108)
        at com.mongodb.DBPort.go(DBPort.java:82)
        at com.mongodb.DBPort.findOne(DBPort.java:142)
        at com.mongodb.DBPort.runCommand(DBPort.java:151)
        at com.mongodb.ReplicaSetStatus$Node.update(ReplicaSetStatus.java:178)
        at com.mongodb.ReplicaSetStatus.updateAll(ReplicaSetStatus.java:349)
        at com.mongodb.ReplicaSetStatus$Updater.run(ReplicaSetStatus.java:296)
Oct 30, 2011 8:12:22 PM com.mongodb.DBTCPConnector _set

Cassandra Java Annotations

August 30, 2010

Overview

Cassandra has a unique column-oriented data model which does not easily map to an entity-based Java model. Furthermore, the Java Thrift client implementation is very low-level and presents the developer with a rather difficult API to work with on a daily basis. This situation  is a good candidate for an adapter to shield the business code from mundane plumbing details.

I recently did some intensive Cassandra (version 0.6.5) work to load millions of geographical postions for ships at sea.  Locations were already being stored in MySQL/Innodb using JPA/Hibernate so I already had a ready-made model based on JPA entity beans. After some analysis, I created a mini-framework based on custom annotations and a substantial adapter to encapsulate all the “ugly” Thrift boiler-plate code.  Naturally everything was wired together with Spring.

Implementation

The very first step was to investigate existing Cassandra Java client toolkits. As usual in a startup environment time was at a premium, but I quickly checked out a few key clients. Firstly, I looked at Hector, but its API still exposed too much of the Thrift cruft for my needs. It did have nice features for failover and connection pooling, and I will definitely look at it in more detail in the future. Pelops looked really cool with its Mutators and Selectors, but it too dealt with columns – see the description.  What I was looking for was an object-oriented way to load and query Java beans. Note that this OO entity-like paradigm might not be applicable to other Cassandra data models, e.g. sparse matrices.

And then there was DataNucleus which advertises JPA/JDO implementations for a large variety of non-SQL persistence stores: LDAP, Hadoop Hbase, Google App, etc. There was mention of a Cassandra solution, but it wasn’t yet ready for prime time. How they manage to address the massive semantic mismatch between JPA is beyond me – unfortunately I didn’t have time to drill down. Seems fishy – but I’ll definitely check this out in the future. Even though I’m a big fan of using existing frameworks/tools, there are times when “rolling your own” is the best course of action.

The following collaborating classes comprised the  framework:

  • CassandraDao – High-level class that understands annotated entity beans
  • ColumnFamily – An adapter for common column family operations – hides the Thrift gore
  • AnnotationManager – Manages the annotated beans
  • TypeMapper – Maps Java data types into bytes and vice versa

Since we already had a JPA-annotated Location bean, my first thought was to reuse this class and simply process the the JPA annotations into their equivalent Cassandra concepts. Upon further examination this proved ugly – the semantic mismatch was too great. I certainly did not want to be importing JPA/Hibernate packages into a Cassandra application! Furthermore, many annotations (such as collections) were not applicable and I needed  annotations for Cassandra concepts that did not exist in JPA. In “set theoretic” terms, there are JPA-specific features, Cassandra-specific features and an intersection of the two.

The first-pass implementation required only three annotations: Entity, Column and Key. The Entity annotation is a class-level annotation with keyspace and columnFamily attributes. The Column annotation closely corresponded to its JPA equivalent. The Key annotation specifies the row key. The Entity defines the column family/keyspace  that the entity belongs to and its constituent columns. The CassandraDao class corresponds to a single column family and accepts an entity and type mapper.

Two column families were created: a column family for ship definitions, and a super column family for ship locations. The Ship CF was a simple collection of ship details keyed by each ship’s MMSI (a unique ID for a ship which is typically engraved on the keel).  The Location CF represented a one-to-many relationship for all the possible locations of a ship. The key was the ship’s MMSI, and the column names were Long types representing the millisecond timestamp for the location. The value of the column was a super column – it contained the columns as defined in the ShipLocation bean – latitude, longitude, course over ground, speed over ground, etc.  The number of location for a given ship could possibly range in the millions!

From an implementation perspective, I was rather surprised to find that there are no standard reusable classes to map basic Java data types to bytes. Sure, String has getBytes(), but I had to do some non-trivial distracting detective work to get doubles, longs, BigInteger, BigDecimal and Dates converted – all the shifting magic etc. Also made sure to run some performance tests to choose the best alternative!

CassandraDao

The DAO is based on the standard concept of  a genericized DAO of which many versions are floating around:

The initial version of the DAO with basic CRUD functionality is shown below:

public class CassandraDao<T> {
  public CassandraDao(Class<T> clazz, CassandraClient client, TypeMapper mapper)
  public T get(String key)
  public void insert(T entity)
  public T getSuperColumn(String key, byte[] superColumnName)
  public List<T> getSuperColumns(String key, List<byte[]> superColumnNames)
  public void insertSuperColumn(String key, T entity)
  public void insertSuperColumns(String key, List<T> entities)
 }

Of course more complex batch and range operations that reflect advanced Cassandra API methods are needed.

Usage Sample

  import com.google.common.collect.ImmutableList;
  import org.springframework.context.support.ClassPathXmlApplicationContext;
  import org.springframework.context.ApplicationContext;

  // initialization
  ApplicationContext context = new ClassPathXmlApplicationContext("config.xml");
  CassandraDao<Ship> shipDao = (CassandraDao<Ship>)context.getBean("shipDao");
  CassandraDao<ShipLocation> shipLocationDao =
    (CassandraDao<ShipLocation>)context.getBean("shipLocationDao");
  TypeMapper mapper = (DefaultTypeMapper)applicationContext.getBean("typeMapper");

  // get ship
  Ship ship = shipDao.get("1975");

  // insert ship
  Ship ship = new Ship();
  ship.setMmsi(1975); // note: row key - framework insert() converts to required String
  ship.setName("Hokulea");
  shipDao.insert(ship);

  // get ship location (super column)
  byte [] superColumn = typeMapper.toBytes(1283116367653L));
  ShipLocation location = shipLocationDao.getSuperColumn("1975",superColumn);

  // get ship locations (super column)
  ImmutableList<byte[]> superColumns = ImmutableList.of( // Until Java 7, Google rocks!
    typeMapper.toBytes(1283116367653L),
    typeMapper.toBytes(1283116913738L),
    typeMapper.toBytes(1283116977580L));
  List<ShipLocation> locations = shipLocationDao.getSuperColumns("1975",superColumns);

  // insert ship location (super column)
  ShipLocation location = new ShipLocation();
  location.setTimestamp(new Date());
  location.setLat(20);
  location.setLon(-90);
  shipLocationDao.insertSuperColumn("1775",location);

Java Entity Beans

Ship

@Entity( keyspace="Marine", columnFamily="Ship")
public class Ship {
  private Integer mmsi;
  private String name;
  private Integer length;
  private Integer width;

  @Key
  @Column(name = "mmsi")
  public Integer getMmsi() {return this.mmsi;}
  public void setMmsi(Integer mmsi) {this.mmsi= mmsi;}

  @Column(name = "name")
  public String getName() { return name; }
  public void setName(String name) { this.name = name; }
}

ShipLocation

@Entity( keyspace="Marine", columnFamily="ShipLocation")
public class ShipLocation {
  private Integer mmsi;
  private Date timestamp;
  private Double lat;
  private Double lon;

  @Key
  @Column(name = "mmsi")
  public Integer getMmsi() {return this.mmsi;}
  public void setMmsi(Integer mmsi) {this.mmsi= mmsi;}

  @Column(name = "timestamp")
  public Date getTimestamp() {return this.timestamp;}
  public void setTimestamp(Date timestamp) {this.timestamp = msgTimestamp;}

  @Column(name = "lat")
  public Double getLat() {return this.lat;}
  public void setLat(Double lat) {this.lat = lat;}

  @Column(name = "lon")
  public Double getLon() {return this.lon;}
  public void setLon(Double lon) {this.lon = lon;}
}

Spring Configuration

 <bean id="propertyConfigurer">
   <property name="systemPropertiesModeName" value="SYSTEM_PROPERTIES_MODE_OVERRIDE" />
   <property name="location" value="classpath:config.properties</value>
 </bean>
 <bean id="shipDao" class="com.andre.cassandra.dao.CassandraDao" scope="prototype" >
   <constructor-arg value="com.andre.cassandra.data.Ship" />
   <constructor-arg ref="cassandraClient" />
   <constructor-arg ref="typeMapper" />
 </bean>
 <bean id="shipLocationDao" scope="prototype" >
   <constructor-arg value="com.andre.cassandra.data.ShipLocation" />
   <constructor-arg ref="cassandraClient" />
   <constructor-arg ref="typeMapper" />
 </bean>

<bean id="cassandraClient" class="com.andre.cassandra.util.CassandraClient" scope="prototype" >
  <constructor-arg value="${cassandra.host}" />
  <constructor-arg value="${cassandra.port}" />
</bean>

<bean id="typeMapper" class="com.andre.cassandra.util.DefaultTypeMapper" scope="prototype" />

Annotation Documentation

Annotations

Annotation Class/Field Description
Entity Class Defines the keyspace and column family
Column Field Column name
Key Field Row key

Entity Attributes

Attribute Type Description
keyspace String Keyspace
columnFamily String Column Family

VTest Testing Framework

April 12, 2010

In order to test basic Voldemort API methods under specified realistic load scenarios, I leveraged the “VTest” framework that I had previously written for load testing. VTest is a light-weight Spring-based framework that separates the execution strategy from the business tasks and provides cross-cutting features such as statistics gathering and reporting.

The main features of VTest are:

  • Declarative workflow-based testing framework based on Spring
  • Separation of concerns: framework, executor, job, task, key and value generation strategies
  • Implementations of these concerns are all pluggable and configurable via Spring dependency injection and bean wiring
  • Framework handles cross-cutting concerns: error handling, call statistics, result reporting, and result persistence
  • Conceptually inspired by java.util.concurrent’s Executor
  • Executors: SequentialExecutor, FixedThreadPoolExecutor, ScheduledThreadPoolExecutor
  • Executor invokes a job or an individual task
  • A task is a unit of work – a job is a collection of tasks
  • Tasks are implemented as Java classes
  • Jobs are specified as lists of tasks in Spring XML configuration file
  • VTest configuration acts as a high-level testing DSL (Domain Specific Language)

Sample Result

Here is a sample output for a CRUD job that puts one million key/value pairs, gets them, updates them and finally deletes them. Each task is executed for N requests – N being one million – with a thread pool of 200. The pool acts as a Leaky Bucket (thanks to Joe for this handy reference). The job is executed for five iterations and both the details of each individual run and the aggregated result are displayed.

Description of columns:

  • Req/Sec – requests per second or throughput
  • Ratio – the fraction of total time for the task. The ratio is an inverse of the throughput – the higher the ratio, the lower the throughput.
  • The five % columns represent standard latency percentiles. For example, in the first PutCreate 99-th percentile means that 99% of the requests were 384 milliseconds or less.
  • Max – maximum latency. It is instructive to see that for large request sets, the 99.9 percentile doesn’t accurately portray the slowest requests. Notice that for the first PutCreate the Max is over five seconds whereas the 99.9 percentile is only 610 milliseconds. There’s a lot going on this in 0.01 % of requests! In fact Vogels makes a point that  Amazon doesn’t focus so much on averages but on reducing these exterme “outliers”.
  • Errors – number of exceptions thrown by the server. There is an example in the third PutUpdate.
  • Fails – number of failures. A failure is when the server does not throw an exception but the business logic deems the result incorrect. For example, if the retrieved value does not match its expected value, a failure is noted. Observe that there are 29,832 failures for the third Get – a rather worrisome occurrence.
  • StdDev – standard deviation
==== DETAIL STATUS ============ 

Test         Req/Sec    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
PutCreate       9921      7     29    384    454    610   5022       0      0   61.31
PutCreate       9790      7     31    358    427    516    707       0      0   55.23
PutCreate       8727      7     32    398    457    558    980       0      0   63.98
PutCreate      14354      7     26    122    213    375    613       0      0   27.51
PutCreate       8862      7     31    402    461    577    876       0      0   63.65
Total           9639      7     30    376    442    547   5022       0      0   58.03  

Test         Req/Sec    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
Get            24364      6     10     78     88    114    440       0      0   11.35
Get            23568      6     11     81     89    159    320       0      0   12.31
Get            22769      7     11     81     89    109    381       0  28932   11.93
Get            23174      7     10     80     87     99    372       0      0   11.78
Get            22919      7     10     80     89    216    369       0      0   13.33
Total          23264      7     10     80     88    110    440       0  28932   12.15  

Test         Req/Sec    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
PutUpdate       6555     11     32    554    943   1115   2272       0      0  101.49
PutUpdate       6412     11     32    574    900   1083   2040       0      0  101.99
PutUpdate       2945      3     10   4007   4009   4020   6010       1      0  494.14
PutUpdate       6365     11     35    537    746   1101   2118       0      0   97.55
PutUpdate       6634     11     32    537    853   1095   1293       0      0   98.18
Total           5668     10     31    554    978   4008   6010       1      0  197.87  

Count  Exception
1      class voldemort.store.InsufficientSuccessfulNodesException  

Test         Req/Sec    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
Delete          6888     17     46    266    342    442    860       0      0   44.37
Delete          7649     17     43    176    263    395    619       0      0   34.11
Delete          8156     17     43    133    153    244    423       0   8544   25.03
Delete          7539     17     44    180    276    447    759       0      0   36.53
Delete          7457     17     43    218    285    420    714       0      0   38.02
Total           7494     17     44    203    280    410    860       0   8544   36.44  

=== SUMMARY STATUS ============
Test         Req/Sec  Ratio    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
DeleteTable   307456  0.01       0      0      0      0      0      2       0      0    0.01
StoreCreate     9639  0.23       7     30    376    442    547   5022       0      0   58.03
Retrieve       23264  0.09       7     10     80     88    110    440       0  28932   12.15
StoreUpdate     5668  0.38      10     31    554    978   4008   6010       1      0  197.87
Delete          7494  0.29      17     44    203    280    410    860       0   8544   36.44
Total                                                                       1  37476         

Count  Exception
1      voldemort.store.InsufficientSuccessfulNodesException  

Config Parameters:
  requests           : 1000000
  threadPoolSize     : 200
  valueSize          : 1000

Sample Chart

Since call statistics are persisted in a structured XML file, the results can be post-processed and charts can be generated. The example below compares the throughput for four different record sizes: 1k, 2k, 3k and 5k. It is implemented using the popular open-source JFreeChart package .

VTest Job Configuration File

The jobs and tasks are defined and configured in a standard Spring configuration file. For ease-of-use, the dynamically varying properties are externalized in the vtest.properties file.

<beans>
  <bean id="propertyConfigurer"
        class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
    <property name="locations" value="classpath:vtest.properties" />
    <property name="systemPropertiesMode" value="2" />
  </bean>

<!-- ** Jobs/Tasks ************************ -->

  <util:list id="crud.job"  >
    <ref bean="putCreate.task" />
    <ref bean="get.task" />
    <ref bean="putUpdate.task" />
    <ref bean="delete.task" />
  </util:list>

  <bean id="putCreate.task" class="com.amm.vtest.tasks.voldemort.PutTask" scope="prototype" >
    <constructor-arg ref="taskConfig" />
    <constructor-arg value="PutCreate" />
  </bean>

  <bean id="putUpdate.task" class="com.amm.vtest.tasks.voldemort.PutTask" scope="prototype" >
    <constructor-arg ref="taskConfig" />
    <constructor-arg value="PutUpdate" />
  </bean>

  <bean id="get.task" class="com.amm.vtest.tasks.voldemort.GetTask" scope="prototype" >
    <constructor-arg ref="taskConfig" />
  </bean>

  <bean id="delete.task" class="com.amm.vtest.tasks.voldemort.DeleteTask" scope="prototype" >
    <constructor-arg ref="taskConfig" />
  </bean>

  <bean id="taskConfig" class="com.amm.vtest.tasks.voldemort.VoldemortTaskConfig" scope="prototype" >
    <constructor-arg value="${cfg.store}" />
    <constructor-arg value="${cfg.urls}" />
    <constructor-arg value="${cfg.clientConfigFile}" />
    <property name="valueSize"      value="${cfg.valueSize}" />
    <property name="valueGenerator" ref="valueGenerator" />
    <property name="keyGenerator"   ref="keyGenerator" />
    <property name="checkValue"     value="${cfg.checkRetrieveValue}" />
  </bean>

<!-- ** VTest **************** -->

  <bean id="vtestProcessor"
        class="com.amm.vtest.VTestProcessor" scope="prototype">
    <constructor-arg ref="executor" />
    <constructor-arg ref="callStatsReporter" />
    <property name="warmup"          value="${cfg.warmup}" />
    <property name="logDetails"      value="true" />
    <property name="logDetailsAsXml" value="true" />
  </bean>

  <bean id="callStatsReporter"
        class="com.amm.vtest.services.callstats.CallStatsReporter" scope="prototype">
    <property name="properties" ref="configProperties" />
  </bean>

  <util:map id="configProperties">
    <entry key="requests" value="${cfg.requests}" />
    <entry key="threadPoolSize" value="${cfg.threadPoolSize}" />
    <entry key="valueSize" value="${cfg.valueSize}" />
  </util:map >

<!-- ** Executors **************** -->

  <alias alias="executor" name="fixedThreadPool.executor" />

  <bean id="sequential.executor"
        class="com.amm.vtest.SequentialExecutor" scope="prototype">
    <property name="numRequests" value="${cfg.requests}" />
  </bean>

  <bean id="fixedThreadPool.executor"
        class="com.amm.vtest.FixedThreadPoolExecutor" scope="prototype">
    <property name="numRequests"     value="${cfg.requests}" />
    <property name="threadPoolSize"  value="${cfg.threadPoolSize}" />
    <property name="logModulo"       value="${cfg.logModulo}" />
  </bean>

</beans>

VTest Properties

cfg.urls=tcp://10.22.48.50:6666,tcp://10.22.48.51:6666,tcp://10.22.48.52:6666
cfg.store=test_mysql
cfg.requests=1000000
cfg.valueSize=1000
cfg.threadPoolSize=200
cfg.clientConfigFile=client.properties
cfg.checkRetrieveValue=false
cfg.warmup=false
cfg.logModulo=1000
cfg.fixedKeyGenerator.size=36
cfg.fixedKeyGenerator.reset=true

Run Script

. common.env

CPATH="$CPATH;config"
PGM=com.amm.vtest.VTestDriver
STORE=test_bdb
CONFIG=vtest.xml

job=crud.job
iterations=1
requests=1000000
threadPoolSize=200
valueSize=1000

opts="r:t:v:i:"
while getopts $opts opt
  do
  case $opt in
    r) requests=$OPTARG ;;
    t) threadPoolSize=$OPTARG ;;
    v) valueSize=$OPTARG ;;
    i) iterations=$OPTARG ;;
    \?) echo $USAGE " Error"
        exit;;
    esac
  done
shift `expr $OPTIND - 1`
if [ $# -gt 0 ] ; then
  job=$1
  fi

tstamp=`date "+%F_%H-%M"` ; logdir=logs-$job-$tstamp ; mkdir $logdir

PROPS=
PROPS="$PROPS -Dcfg.requests=$requests"
PROPS="$PROPS -Dcfg.threadPoolSize=$threadPoolSize"
PROPS="$PROPS -Dcfg.valueSize=$valueSize"

time -p java $PROPS -cp $CPATH $PGM $* \
  --config $CONFIG --iterations $iterations --job $job \
  | tee log.txt

cp -p log.txt log-*.xml times-*.txt *.log $logdir

XML Logging Output

The call statistics for each task run are stored in an XML files for future reference and possible post-processing, e.g. charts, database persistences, cross-run aggregation. JAXB and a XSD schema are used to process the XML.

<callStats>
    <taskName>task-Put</taskName>
    <date>2010-04-04T21:50:21.459-04:00</date>
    <callsPerSecond>13215.975471149524</callsPerSecond>
    <elapsedTime>75666</elapsedTime>
    <standardDeviation>27.547028113708425</standardDeviation>
    <callRatio>0.34021105261028106</callRatio>
    <calls failures="0" errors="0" all="1000000"/>
    <percentiles>
        <percentile50>7.0</percentile50>
        <percentile90>31.0</percentile90>
        <percentile99>124.0</percentile99>
        <percentile995>163.0</percentile995>
        <percentile999>269.0</percentile999>
    </percentiles>
</callStats>