Archive for April, 2010

VTest Testing Framework

April 12, 2010

To test basic Voldemort API methods under specified, realistic load scenarios, I leveraged the “VTest” framework that I had previously written for load testing. VTest is a lightweight Spring-based framework that separates the execution strategy from the business tasks and provides cross-cutting features such as statistics gathering and reporting.

The main features of VTest are:

  • Declarative workflow-based testing framework based on Spring
  • Separation of concerns: framework, executor, job, task, key and value generation strategies
  • Implementations of these concerns are all pluggable and configurable via Spring dependency injection and bean wiring
  • Framework handles cross-cutting concerns: error handling, call statistics, result reporting, and result persistence
  • Conceptually inspired by java.util.concurrent’s Executor
  • Executors: SequentialExecutor, FixedThreadPoolExecutor, ScheduledThreadPoolExecutor
  • Executor invokes a job or an individual task
  • A task is a unit of work – a job is a collection of tasks
  • Tasks are implemented as Java classes
  • Jobs are specified as lists of tasks in Spring XML configuration file
  • VTest configuration acts as a high-level testing DSL (Domain Specific Language)
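
To make the separation of concerns concrete, here is a minimal sketch of the kind of task/executor contract described above. The names and signatures (Task, Executor) are illustrative assumptions, not VTest’s actual API:

public interface Task {
  void execute() throws Exception;   // one unit of work, e.g. a single put()
  String getName();                  // used in statistics reporting
}

public interface Executor {
  // Runs every task in the job for the configured number of requests,
  // while the framework gathers call statistics around each execute().
  void run(java.util.List<Task> job);
}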

Sample Result

Here is a sample output for a CRUD job that puts one million key/value pairs, gets them, updates them and finally deletes them. Each task is executed for N requests – N being one million – with a thread pool of 200. The pool acts as a Leaky Bucket (thanks to Joe for this handy reference). The job is executed for five iterations and both the details of each individual run and the aggregated result are displayed.

Description of columns:

  • Req/Sec – requests per second or throughput
  • Ratio – the fraction of the total job time consumed by the task. The ratio is inversely related to the throughput – the higher the ratio, the lower the throughput.
  • The five % columns represent standard latency percentiles. For example, in the first PutCreate row the 99th-percentile value of 384 means that 99% of the requests took 384 milliseconds or less.
  • Max – maximum latency. It is instructive to see that for large request sets, the 99.9th percentile doesn’t accurately portray the slowest requests. Notice that for the first PutCreate the Max is over five seconds whereas the 99.9th percentile is only 610 milliseconds. There’s a lot going on in that last 0.1% of requests! In fact, Vogels makes the point that Amazon doesn’t focus so much on averages as on reducing these extreme “outliers”.
  • Errors – number of exceptions thrown by the server. There is an example in the third PutUpdate.
  • Fails – number of failures. A failure is when the server does not throw an exception but the business logic deems the result incorrect – for example, when a retrieved value does not match its expected value. Observe that there are 28,932 failures for the third Get – a rather worrisome occurrence.
  • StdDev – standard deviation
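
As a quick sanity check on the Ratio column (my own arithmetic, not part of the tool’s output): in the summary below, StoreUpdate runs one million requests at 5,668 requests/second, i.e. roughly 176 seconds. The analogous times for all five tasks sum to roughly 460 seconds, and 176/460 ≈ 0.38 – exactly the Ratio reported for StoreUpdate.
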
==== DETAIL STATUS ============ 

Test         Req/Sec    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
PutCreate       9921      7     29    384    454    610   5022       0      0   61.31
PutCreate       9790      7     31    358    427    516    707       0      0   55.23
PutCreate       8727      7     32    398    457    558    980       0      0   63.98
PutCreate      14354      7     26    122    213    375    613       0      0   27.51
PutCreate       8862      7     31    402    461    577    876       0      0   63.65
Total           9639      7     30    376    442    547   5022       0      0   58.03  

Test         Req/Sec    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
Get            24364      6     10     78     88    114    440       0      0   11.35
Get            23568      6     11     81     89    159    320       0      0   12.31
Get            22769      7     11     81     89    109    381       0  28932   11.93
Get            23174      7     10     80     87     99    372       0      0   11.78
Get            22919      7     10     80     89    216    369       0      0   13.33
Total          23264      7     10     80     88    110    440       0  28932   12.15  

Test         Req/Sec    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
PutUpdate       6555     11     32    554    943   1115   2272       0      0  101.49
PutUpdate       6412     11     32    574    900   1083   2040       0      0  101.99
PutUpdate       2945      3     10   4007   4009   4020   6010       1      0  494.14
PutUpdate       6365     11     35    537    746   1101   2118       0      0   97.55
PutUpdate       6634     11     32    537    853   1095   1293       0      0   98.18
Total           5668     10     31    554    978   4008   6010       1      0  197.87  

Count  Exception
1      class voldemort.store.InsufficientSuccessfulNodesException  

Test         Req/Sec    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
Delete          6888     17     46    266    342    442    860       0      0   44.37
Delete          7649     17     43    176    263    395    619       0      0   34.11
Delete          8156     17     43    133    153    244    423       0   8544   25.03
Delete          7539     17     44    180    276    447    759       0      0   36.53
Delete          7457     17     43    218    285    420    714       0      0   38.02
Total           7494     17     44    203    280    410    860       0   8544   36.44  

=== SUMMARY STATUS ============
Test         Req/Sec  Ratio    50%    90%    99%  99.5%  99.9%    Max  Errors  Fails  StdDev
DeleteTable   307456  0.01       0      0      0      0      0      2       0      0    0.01
StoreCreate     9639  0.23       7     30    376    442    547   5022       0      0   58.03
Retrieve       23264  0.09       7     10     80     88    110    440       0  28932   12.15
StoreUpdate     5668  0.38      10     31    554    978   4008   6010       1      0  197.87
Delete          7494  0.29      17     44    203    280    410    860       0   8544   36.44
Total                                                                       1  37476         

Count  Exception
1      voldemort.store.InsufficientSuccessfulNodesException  

Config Parameters:
  requests           : 1000000
  threadPoolSize     : 200
  valueSize          : 1000

Sample Chart

Since call statistics are persisted in a structured XML file, the results can be post-processed and charts can be generated. The example below compares the throughput for four different record sizes: 1k, 2k, 3k and 5k. It is implemented using the popular open-source JFreeChart package.
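
For illustration, such a chart can be produced with a few lines of JFreeChart. This is only a sketch – the class name and the hard-coded numbers (taken from the 1k summary above) are placeholders; real code would parse the callsPerSecond values out of the persisted XML, one series per record size:

import java.io.File;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartUtilities;
import org.jfree.chart.JFreeChart;
import org.jfree.chart.plot.PlotOrientation;
import org.jfree.data.category.DefaultCategoryDataset;

public class ThroughputChart {
  public static void main(String[] args) throws Exception {
    DefaultCategoryDataset dataset = new DefaultCategoryDataset();
    // Placeholder values from the 1k summary; a real run would add
    // one series per record size (1k, 2k, 3k, 5k).
    dataset.addValue(9639,  "1k", "PutCreate");
    dataset.addValue(23264, "1k", "Get");
    dataset.addValue(5668,  "1k", "PutUpdate");
    dataset.addValue(7494,  "1k", "Delete");
    JFreeChart chart = ChartFactory.createBarChart(
      "Throughput by Record Size", "Task", "Requests/sec",
      dataset, PlotOrientation.VERTICAL, true, false, false);
    ChartUtilities.saveChartAsPNG(new File("throughput.png"), chart, 800, 600);
  }
}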

VTest Job Configuration File

The jobs and tasks are defined and configured in a standard Spring configuration file. For ease-of-use, the dynamically varying properties are externalized in the vtest.properties file.

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans
         http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/util
         http://www.springframework.org/schema/util/spring-util.xsd">
  <bean id="propertyConfigurer"
        class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
    <property name="locations" value="classpath:vtest.properties" />
    <property name="systemPropertiesMode" value="2" />
  </bean>

<!-- ** Jobs/Tasks ************************ -->

  <util:list id="crud.job"  >
    <ref bean="putCreate.task" />
    <ref bean="get.task" />
    <ref bean="putUpdate.task" />
    <ref bean="delete.task" />
  </util:list>

  <bean id="putCreate.task" class="com.amm.vtest.tasks.voldemort.PutTask" scope="prototype" >
    <constructor-arg ref="taskConfig" />
    <constructor-arg value="PutCreate" />
  </bean>

  <bean id="putUpdate.task" class="com.amm.vtest.tasks.voldemort.PutTask" scope="prototype" >
    <constructor-arg ref="taskConfig" />
    <constructor-arg value="PutUpdate" />
  </bean>

  <bean id="get.task" class="com.amm.vtest.tasks.voldemort.GetTask" scope="prototype" >
    <constructor-arg ref="taskConfig" />
  </bean>

  <bean id="delete.task" class="com.amm.vtest.tasks.voldemort.DeleteTask" scope="prototype" >
    <constructor-arg ref="taskConfig" />
  </bean>

  <bean id="taskConfig" class="com.amm.vtest.tasks.voldemort.VoldemortTaskConfig" scope="prototype" >
    <constructor-arg value="${cfg.store}" />
    <constructor-arg value="${cfg.urls}" />
    <constructor-arg value="${cfg.clientConfigFile}" />
    <property name="valueSize"      value="${cfg.valueSize}" />
    <property name="valueGenerator" ref="valueGenerator" />
    <property name="keyGenerator"   ref="keyGenerator" />
    <property name="checkValue"     value="${cfg.checkRetrieveValue}" />
  </bean>

<!-- ** VTest **************** -->

  <bean id="vtestProcessor"
        class="com.amm.vtest.VTestProcessor" scope="prototype">
    <constructor-arg ref="executor" />
    <constructor-arg ref="callStatsReporter" />
    <property name="warmup"          value="${cfg.warmup}" />
    <property name="logDetails"      value="true" />
    <property name="logDetailsAsXml" value="true" />
  </bean>

  <bean id="callStatsReporter"
        class="com.amm.vtest.services.callstats.CallStatsReporter" scope="prototype">
    <property name="properties" ref="configProperties" />
  </bean>

  <util:map id="configProperties">
    <entry key="requests" value="${cfg.requests}" />
    <entry key="threadPoolSize" value="${cfg.threadPoolSize}" />
    <entry key="valueSize" value="${cfg.valueSize}" />
  </util:map>

<!-- ** Executors **************** -->

  <alias alias="executor" name="fixedThreadPool.executor" />

  <bean id="sequential.executor"
        class="com.amm.vtest.SequentialExecutor" scope="prototype">
    <property name="numRequests" value="${cfg.requests}" />
  </bean>

  <bean id="fixedThreadPool.executor"
        class="com.amm.vtest.FixedThreadPoolExecutor" scope="prototype">
    <property name="numRequests"     value="${cfg.requests}" />
    <property name="threadPoolSize"  value="${cfg.threadPoolSize}" />
    <property name="logModulo"       value="${cfg.logModulo}" />
  </bean>

</beans>

VTest Properties

cfg.urls=tcp://10.22.48.50:6666,tcp://10.22.48.51:6666,tcp://10.22.48.52:6666
cfg.store=test_mysql
cfg.requests=1000000
cfg.valueSize=1000
cfg.threadPoolSize=200
cfg.clientConfigFile=client.properties
cfg.checkRetrieveValue=false
cfg.warmup=false
cfg.logModulo=1000
cfg.fixedKeyGenerator.size=36
cfg.fixedKeyGenerator.reset=true

Run Script

. common.env

CPATH="$CPATH;config"
PGM=com.amm.vtest.VTestDriver
STORE=test_bdb
CONFIG=vtest.xml

job=crud.job
iterations=1
requests=1000000
threadPoolSize=200
valueSize=1000

opts="r:t:v:i:"
while getopts $opts opt
  do
  case $opt in
    r) requests=$OPTARG ;;
    t) threadPoolSize=$OPTARG ;;
    v) valueSize=$OPTARG ;;
    i) iterations=$OPTARG ;;
    \?) echo $USAGE " Error"
        exit;;
    esac
  done
shift `expr $OPTIND - 1`
if [ $# -gt 0 ] ; then
  job=$1
  fi

tstamp=`date "+%F_%H-%M"` ; logdir=logs-$job-$tstamp ; mkdir $logdir

PROPS=
PROPS="$PROPS -Dcfg.requests=$requests"
PROPS="$PROPS -Dcfg.threadPoolSize=$threadPoolSize"
PROPS="$PROPS -Dcfg.valueSize=$valueSize"

time -p java $PROPS -cp $CPATH $PGM $* \
  --config $CONFIG --iterations $iterations --job $job \
  | tee log.txt

cp -p log.txt log-*.xml times-*.txt *.log $logdir

XML Logging Output

The call statistics for each task run are stored in XML files for future reference and possible post-processing, e.g. charts, database persistence, or cross-run aggregation. JAXB and an XSD schema are used to process the XML; a reading sketch follows the sample below.

<callStats>
    <taskName>task-Put</taskName>
    <date>2010-04-04T21:50:21.459-04:00</date>
    <callsPerSecond>13215.975471149524</callsPerSecond>
    <elapsedTime>75666</elapsedTime>
    <standardDeviation>27.547028113708425</standardDeviation>
    <callRatio>0.34021105261028106</callRatio>
    <calls failures="0" errors="0" all="1000000"/>
    <percentiles>
        <percentile50>7.0</percentile50>
        <percentile90>31.0</percentile90>
        <percentile99>124.0</percentile99>
        <percentile995>163.0</percentile995>
        <percentile999>269.0</percentile999>
    </percentiles>
</callStats>
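
Reading the statistics back for post-processing is then a standard JAXB unmarshal. The CallStats class below is a hypothetical JAXB-annotated counterpart of the XML above (only a few fields shown), not the actual VTest class, and the file name is made up:

import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name = "callStats")
class CallStats {
  // Public fields are bound by JAXB's default PUBLIC_MEMBER access;
  // names match the XML elements above.
  public String taskName;
  public double callsPerSecond;
  public long   elapsedTime;
  public double standardDeviation;
  public double callRatio;
}

public class CallStatsReader {
  public static void main(String[] args) throws Exception {
    JAXBContext ctx = JAXBContext.newInstance(CallStats.class);
    CallStats stats = (CallStats) ctx.createUnmarshaller()
                                     .unmarshal(new File("log-task-Put.xml"));
    System.out.println(stats.taskName + ": " + stats.callsPerSecond + " req/sec");
  }
}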

Eventual Consistency Testing

April 12, 2010

I’ve recently been involved in testing a massively scalable application based on an eventual consistency framework called Voldemort.

The key articles on Dynamo and eventual consistency are:

  • Dynamo: Amazon’s Highly Available Key-value Store – DeCandia et al., SOSP 2007
  • Eventually Consistent – Werner Vogels

Dynamo has inspired a variety of frameworks based on distributed hash table principles, such as Cassandra, Voldemort and Mongo. What all these tools strive to address is the inherent limit to massive scalability of traditional relational databases. Hence the name “NoSQL”.

How is Dynamo tested?

All this sounds fine, but the real question is: does it work in real life? Unfortunately Amazon has not exposed the Dynamo source code, and except that it is written in Java, little is known about it. As a pragmatic sort of fellow, I am always keen on knowing the nuts and bolts of new-fangled solutions. How does Amazon certify builds? What is Amazon’s test framework for a massively scalable system such as Dynamo? What sort of tests do they have? How do they specify and test their SLAs? How do they test the intricate quorum-based logic as cluster nodes are brought up and down? I could well imagine the complexity of such a test environment exceeding the complexity of the application itself.

Embedded and Standalone Execution Contexts

One of the nice things about the Voldemort project is its strong emphasis on modularity and mockable objects. The Voldemort server can be launched in an embedded mode, which greatly facilitates many testing scenarios. However, this in no way replaces the need to test against an actual standalone server. Embedded vs. standalone testing is a false dilemma: the vast majority of test cases can and should be run in both modes – embedded for ease of use, standalone for truer validation since it more closely approximates the target production environment. So the first step was to create an “execution context” object that encapsulates the different bootstrapping logic.

InitContext Interface.

public interface InitContext {
  public void start() throws IOException ;      // bootstrap the server (no-op for standalone)
  public void stop() throws IOException ;       // tear down the server
  public VoldemortStoreDao getTestStoreDao() ;  // direct store access for test assertions
}
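
A test can then bootstrap either environment without caring which one it got. A minimal sketch – isEmbedded and RemoteStoreDao are hypothetical names, not part of the actual code:

InitContext context = testConfig.isEmbedded()
  ? new EmbeddedServerInitContext(testConfig)
  : new EmptyInitContext(new RemoteStoreDao(testConfig)); // hypothetical DAO over the remote store
context.start();
try {
  VoldemortStoreDao dao = context.getTestStoreDao();
  // ... exercise the store and assert on its contents ...
} finally {
  context.stop();
}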

EmptyInitContext for a standalone server. Nothing much needs to be done since the server is launched externally.

public class EmptyInitContext implements InitContext
{
  private VoldemortStoreDao storeDao ;

  public EmptyInitContext(VoldemortStoreDao storeDao) {
    this.storeDao = storeDao ;
  }

  public void start() throws IOException {
  }

  public void stop() throws IOException {
  }

  public VoldemortStoreDao getTestStoreDao() {
    return storeDao ;
  }
}

EmbeddedServerInitContext for an embedded Voldemort server that uses an embedded Berkeley DB store.

public class EmbeddedServerInitContext implements InitContext
{
  private VoldemortServer server ;
  private TestConfig testConfig ;
  private VoldemortStoreDao testDao ;
  private boolean useNio = false ;

  public EmbeddedServerInitContext(TestConfig testConfig) {
    this.testConfig = testConfig ;
  }

  public void start() throws IOException {
    String configDir = testConfig.getVoldemortConfigDir();
    String dataDir = testConfig.getVoldemortDataDir();
    int nodeId = 0 ;
    server = new VoldemortServer(
      ServerTestUtils.createServerConfig(useNio, nodeId, dataDir,
        configDir + "/cluster.xml", configDir + "/stores.xml",
        new Properties() ));
    server.start();

    // Locate the raw storage engine for the store under test and wrap it in a DAO.
    StoreRepository srep = server.getStoreRepository();
    List stores = srep.getAllLocalStores() ;
    for (Object o : stores) {
      Store store = (Store) o ;
      Store lstore = VoldemortTestUtils.getLeafStore(store);
      if (lstore instanceof StorageEngine) {
        if (store.getName().equals(testConfig.getStoreName())) {
          StorageEngine engine = (StorageEngine) lstore ;
          testDao = new StorageEngineDaoImpl(engine) ;
          break;
        }
      }
    }
  }

  public void stop() throws IOException {
    ServerTestUtils.stopVoldemortServer(server);
  }

  public VoldemortStoreDao getTestStoreDao() {
    return testDao ;
  }
}

Server Cycling

One important place where embedded and standalone testing logic do differ is server cycling. This is especially important when testing eventual consistency scenarios. Server cycling refers to the starting and stopping of server nodes. In embedded mode this is no problem since everything executes inside one JVM process. When the servers are separate processes, the problem becomes significantly more difficult. Stopping a remote Voldemort server actually turns out to be easy since Voldemort exposes a JMX MBean with a stop operation. Needless to say, this technique cannot be used to start a server! In order to launch a server, the test client has to somehow invoke a script on a remote machine. The following steps need to be done (a rough sketch in Java follows the list):

  • Use Java Runtime.exec to invoke a script on the remote machine via ssh
  • The script must first check that a server is not already running – if it is, an error is returned
  • Script calls voldemort-server.sh
  • Script waits an indeterminate amount of time to allow the server to start
  • Script invokes “some operation” to ascertain that the server is ready to accept requests
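
Here is roughly what that looks like in code – a sketch only, where the script name, port probe, and timeout are my assumptions, and a JMX liveness operation would be more trustworthy than a raw socket connect:

import java.io.IOException;
import java.net.Socket;

public class RemoteServerCycler {
  private final String host;
  private final int port;

  public RemoteServerCycler(String host, int port) {
    this.host = host;
    this.port = port;
  }

  // Launch the server via ssh; start-voldemort.sh is a hypothetical wrapper
  // around voldemort-server.sh that refuses to start a second instance.
  public void start() throws Exception {
    Process p = Runtime.getRuntime().exec(
      new String[] { "ssh", host, "bin/start-voldemort.sh" });
    if (p.waitFor() != 0)
      throw new IllegalStateException("start script failed on " + host);
    waitUntilAccepting(60000);
  }

  // Crude liveness check: poll until the server port accepts connections.
  private void waitUntilAccepting(long timeoutMs) throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      try {
        new Socket(host, port).close();
        return;
      } catch (IOException e) {
        Thread.sleep(500);
      }
    }
    throw new IllegalStateException(host + " not accepting connections");
  }
}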

As you can see, each step is fraught with problems. In local embedded mode this series of complex steps is subsumed in a blocking call that simply creates a new in-process object. In standalone mode, the wait step is problematic since there is no precise amount of time to wait. Wait and then do what to determine server liveness? Invoke an operation? That could affect the integrity of the very operation we are testing! One potential solution is to invoke a JMX operation that serves the purpose of a liveness check. Assuming all goes well, all of this takes time, and for a large battery of tests the overall execution time increases significantly.

Eventual Consistency Test Example

Let us look at some examples. Assume we have a three node cluster with N=3, W=2, R=2. N is the number of nodes each value is replicated to, W is the number of writes that must succeed, and R is the number of reads that must succeed. For example, for a write operation the system will try to write the data to all N nodes; if at least two writes succeed, the operation is considered successful.

Get – One Node Down

  • Call put(K,V)
  • For each node in cluster
    • Stop node
    • Call V2=get(K)
    • Assert that V==V2
    • Start node
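
In JUnit form the scenario looks something like this. Everything named here is a stand-in: cycler is the hypothetical helper above that stops a node via JMX and starts it via ssh, and client is a Voldemort StoreClient:

public void testGetOneNodeDown() throws Exception {
  String key = "k1", value = "v1";
  client.put(key, value);               // write goes to all N=3 replicas
  for (Node node : cluster.getNodes()) {
    cycler.stop(node);                  // take one replica down
    try {
      // With W=2 and R=2, the two surviving replicas must still
      // satisfy the read quorum and return the value we wrote.
      assertEquals(value, client.getValue(key));
    } finally {
      cycler.start(node);               // restore before the next iteration
    }
  }
}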

This logic needs to be executed against all thirteen Voldemort operations: get, put, putIfNotObsolete, delete, etc. Whew! Now imagine if the requirements are to test against two different cluster configurations, say a 3/2/2 and a 5/3/3!