hbase-user mailing list archives

From Suman Srinivasan <su...@longtailvideo.com>
Subject HBase write performance benchmarking script
Date Mon, 16 Jul 2012 15:44:13 GMT
Hi all,

I couldn't find anything like this, so I've put together what I hope is a fairly simple but
comprehensive test script to evaluate write performance on a running HBase cluster.

This is written in Python, and requires the installation of HappyBase (sudo easy_install happybase)
and a running HBase Thrift interface (hbase-daemon.sh start thrift) to the cluster.
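
Before running a benchmark it can help to confirm that the Thrift interface is actually
reachable. The following is a minimal sketch, not part of the original script: the function
name thrift_reachable is hypothetical, and it assumes the Thrift server listens on HBase's
default Thrift port, 9090. It only checks that a TCP connection can be opened, not that the
service behind it is healthy.

```python
import socket


def thrift_reachable(host, port=9090, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: 9090 is assumed to be the Thrift port (the HBase default).
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
    sock.close()
    return True
```

For example, thrift_reachable("localhost") should return True once
"hbase-daemon.sh start thrift" has been run on the local machine.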

This script is meant to test the write performance of an HBase cluster under various
parameters; it writes random row keys, and you can fine-tune the following:
1. Number of write "threads" (actually processes) to run in parallel
2. Number of puts that are batched together (make this 1 to remove batch puts and test raw
single-put operations)
3. Total number of rows written to the cluster
4. The Thrift servers to write to (list several if your cluster runs more than one)
5. Row key: by modifying line #34, which generates the random row key, you can make the row key
closer to your application's needs
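
To make the five parameters concrete, here is a rough sketch of how such a benchmark could be
structured with HappyBase and multiprocessing. It is an illustration under stated assumptions,
not the script itself: every constant, the table name "perf_test", the column "cf:data" and
all function names are hypothetical, and the table is assumed to exist already. With
BATCH_SIZE set to 1, HappyBase's batch sends each row in its own request, which approximates
single-put behaviour.

```python
import random
import string
import time
from multiprocessing import Process

# All names and values below are illustrative assumptions.
NUM_WORKERS = 4                 # (1) parallel writer processes
BATCH_SIZE = 100                # (2) puts per batch; 1 sends one row per request
TOTAL_ROWS = 100000             # (3) rows written across all workers
THRIFT_SERVERS = ["localhost"]  # (4) one or more Thrift hosts
TABLE_NAME = "perf_test"        # hypothetical table, must already exist
COLUMN = b"cf:data"             # hypothetical column in family "cf"


def random_row_key(length=16):
    """(5) Random row key; change this to resemble your application's keys."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(random.choice(alphabet) for _ in range(length)).encode("ascii")


def rows_per_worker(total_rows, num_workers):
    """Split total_rows as evenly as possible across the workers."""
    base, extra = divmod(total_rows, num_workers)
    return [base + (1 if i < extra else 0) for i in range(num_workers)]


def write_rows(host, num_rows, batch_size):
    """Worker process: write num_rows random rows through one Thrift server."""
    import happybase  # imported here so the helpers above work without it

    connection = happybase.Connection(host)
    table = connection.table(TABLE_NAME)
    # The batch is sent every batch_size puts, and once more on exit.
    with table.batch(batch_size=batch_size) as batch:
        for _ in range(num_rows):
            batch.put(random_row_key(), {COLUMN: b"x" * 100})
    connection.close()


def run_benchmark():
    """Start the workers, wait for them, and report overall throughput."""
    counts = rows_per_worker(TOTAL_ROWS, NUM_WORKERS)
    workers = [
        # Workers are assigned to Thrift servers round-robin.
        Process(target=write_rows,
                args=(THRIFT_SERVERS[i % len(THRIFT_SERVERS)], n, BATCH_SIZE))
        for i, n in enumerate(counts)
    ]
    start = time.time()
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    elapsed = time.time() - start
    print("%d rows in %.2f s (%.0f rows/s)"
          % (TOTAL_ROWS, elapsed, TOTAL_ROWS / elapsed))
```

To try it, call run_benchmark() under an `if __name__ == "__main__":` guard, which
multiprocessing needs on platforms that spawn rather than fork. Using processes instead of
threads keeps each writer on its own Thrift connection.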

I'm fairly new to the HBase world, so if there are any major mistakes in this, feel free to
share feedback or fork the code on GitHub and improve on it.

Thank you,

Suman Srinivasan

JW Player | Bits on the Run | LongTail.tv
