hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "Hbase/PerformanceEvaluation" by stack
Date Fri, 08 Jun 2007 17:57:55 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by stack:

The comment on the change is:
First cut at description of the performance evaluation scripts

New page:
= Testing HBase Performance and Scalability =

[https://issues.apache.org/jira/browse/HADOOP-1476 HADOOP-1476] adds the script
{{{org.apache.hadoop.hbase.PerformanceEvaluation}}} to HBase {{{src/test}}}.  It runs the tests
described in ''Performance Evaluation'', Section 7 of the
[http://labs.google.com/papers/bigtable.html BigTable paper].  See the paper for the test
descriptions; they are not repeated here.  The script is useful for evaluating HBase performance
and how well it scales as we add region servers.

Here is the current usage for the {{{PerformanceEvaluation}}} script:

{{{
[stack@aa0-000-12 ~]$ ./hadoop-trunk/src/contrib/hbase/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation
Usage: java org.apache.hadoop.hbase.PerformanceEvaluation[--master=host:port] [--miniCluster]
<command> <nclients>

 master          Specify host and port of HBase cluster master. If not present,
                 address is read from configuration
 miniCluster     Run the test on an HBaseMiniCluster

 randomRead      Run random read test
 randomReadMem   Run random read test where table is in memory
 randomWrite     Run random write test
 sequentialRead  Run sequential read test
 sequentialWrite Run sequential write test
 scan            Run scan test

 nclients        Integer. Required. Total number of clients (and HRegionServers)
                 running: 1 <= value <= 500
 To run a single evaluation client:
 $ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
}}}

If you pass nclients > 1, {{{PerformanceEvaluation}}} starts up a mapreduce job in which
each map runs a single loading client instance.
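The rule above, together with the documented range for {{{nclients}}}, can be sketched as a small shell check to run before launching a test. This is only an illustration: {{{pe_mode}}} is a hypothetical helper, not part of the HBase scripts, and it assumes the range printed in the usage text (1 <= value <= 500) is what the script enforces.

```shell
# pe_mode NCLIENTS
# Hypothetical helper (not part of HBase): reports how PerformanceEvaluation
# is described as running for a given client count, per the usage text.
pe_mode() {
  n=$1
  # Reject anything that is not a plain non-negative integer.
  case "$n" in
    ''|*[!0-9]*) echo "invalid"; return 1 ;;
  esac
  # The usage text documents the range 1 <= value <= 500.
  if [ "$n" -lt 1 ] || [ "$n" -gt 500 ]; then
    echo "out-of-range"
    return 1
  fi
  if [ "$n" -eq 1 ]; then
    # A single client runs directly, without a mapreduce job.
    echo "single-client"
  else
    # More than one client: a mapreduce job, one client per map.
    echo "mapreduce"
  fi
}
```

For example, {{{pe_mode 1}}} prints {{{single-client}}} and {{{pe_mode 4}}} prints {{{mapreduce}}}.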

To run the {{{PerformanceEvaluation}}} script, compile the HBase test classes:

{{{
$ cd ${HBASE_HOME}
$ ant compile-test
}}}

The above ant target compiles all test classes into {{{${HADOOP_HOME}/build/contrib/hbase/test}}}.
It also generates {{{${HADOOP_HOME}/build/contrib/hbase/hadoop-hbase-test.jar}}}.  This jar
includes all HBase test and src classes and has {{{org.apache.hadoop.hbase.PerformanceEvaluation}}}
as its {{{Main-Class}}}.  Use the test jar when running {{{PerformanceEvaluation}}} on a hadoop cluster.

Here is how to run a single-client {{{PerformanceEvaluation}}} ''sequentialWrite'' test:

{{{
$ ${HADOOP_HOME}/src/contrib/hbase/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
}}}

Here is how you would run the same test on a hadoop cluster:

{{{
$ ${HADOOP_HOME}/bin/hadoop jar ${HADOOP_HOME}/build/contrib/hbase/hadoop-hbase-test.jar sequentialWrite 1
}}}

For the latter, you will likely have to copy your hbase configurations -- e.g. your {{{${HBASE_HOME}/conf/hbase*.xml}}}
files -- to {{{${HADOOP_HOME}/conf}}} and make sure they are replicated across the cluster
so your hbase configurations can be found by the running mapreduce job (in particular, clients
need to know the address of the HBase master).
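The local half of that copy step might look like the sketch below. {{{stage_hbase_conf}}} is a hypothetical helper, not something shipped with HBase or hadoop, and it only copies files on the local machine; replicating {{{${HADOOP_HOME}/conf}}} to the other nodes depends on how your cluster is managed and is not shown.

```shell
# stage_hbase_conf HBASE_CONF_DIR HADOOP_CONF_DIR
# Hypothetical helper: copies every hbase*.xml file from the HBase conf
# directory into the hadoop conf directory and prints how many were copied.
stage_hbase_conf() {
  src=$1
  dst=$2
  if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
    echo "usage: stage_hbase_conf HBASE_CONF_DIR HADOOP_CONF_DIR" >&2
    return 1
  fi
  copied=0
  for f in "$src"/hbase*.xml; do
    # If the glob matched nothing, the literal pattern is left; skip it.
    [ -e "$f" ] || continue
    cp "$f" "$dst"/ || return 1
    copied=$((copied + 1))
  done
  echo "$copied"
}
```

A typical call would be {{{stage_hbase_conf "${HBASE_HOME}/conf" "${HADOOP_HOME}/conf"}}}. A printed count of 0 means no hbase configuration was found, in which case the mapreduce clients would likely be unable to locate the HBase master.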

Note that the mapreduce mode of the testing script works a little differently from single-client
mode.  It does not delete the test table at the end of each run, as the script does in
single-client mode.  Nor does it pre-run the '''sequentialWrite''' test before it runs the
'''sequentialRead''' test (the table must be populated with data before sequentialRead can
run).  For the mapreduce version, the onus is on the operator to run the jobs in the correct
order.  To delete a table, use the hbase client.

{{{
$ ${HBASE_HOME}/bin/hbase client listTables
$ ${HBASE_HOME}/bin/hbase client deleteTable TestTable
}}}
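
Since ordering is left to the operator in mapreduce mode, it may help to write the sequence down before running anything. The sketch below only prints the commands for a write-then-read run followed by table cleanup; it executes nothing. {{{pe_mapreduce_plan}}} is a hypothetical helper, and the jar path and {{{TestTable}}} name are taken from the examples on this page.

```shell
# pe_mapreduce_plan HADOOP_HOME HBASE_HOME NCLIENTS
# Hypothetical helper: prints, in order, the commands for a mapreduce-mode
# sequentialWrite / sequentialRead run and the final table cleanup.
# It does not run any of them.
pe_mapreduce_plan() {
  hadoop_home=$1
  hbase_home=$2
  nclients=$3
  jar="$hadoop_home/build/contrib/hbase/hadoop-hbase-test.jar"
  # 1. Populate the table first; mapreduce mode does not pre-run this.
  echo "$hadoop_home/bin/hadoop jar $jar sequentialWrite $nclients"
  # 2. Read back the rows written in step 1.
  echo "$hadoop_home/bin/hadoop jar $jar sequentialRead $nclients"
  # 3. Delete the test table; mapreduce mode leaves it in place.
  echo "$hbase_home/bin/hbase client deleteTable TestTable"
}
```

Calling {{{pe_mapreduce_plan "${HADOOP_HOME}" "${HBASE_HOME}" 4}}} prints three lines; review them and run each by hand once the previous job has finished.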

Some first figures, taken in advance of any profiling of the current state of the HBase code (as of
Fri Jun 8 2007), seem to indicate that HBase runs about an order of magnitude slower than what is
reported in the BigTable paper on similar hardware (more on this to follow).
