hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/PerformanceEvaluation" by stack
Date Sat, 17 Jan 2009 22:53:19 GMT

The following page has been changed by stack:

  Start the cluster fresh for each test, then wait for all regions to be deployed before starting
the tests (this means there is no content in memcache, so for tests such as random read we are always
going to the filesystem, never getting values from memcache).
- ||<rowbgcolor="#ececec">Experiment Run||0.2.0java6||mapfile0.17.1||0.19.0RC1!Java6||0.19.0RC1!Java6!Zlib||mapfile0.19.0||!BigTable||
+ ||<rowbgcolor="#ececec">Experiment Run||0.2.0java6||mapfile0.17.1||0.19.0RC1!Java6||0.19.0RC1!Java6!Zlib||0.19.0RC1!Java6,8Clients||mapfile0.19.0||!BigTable||
- ||random reads ||428||568||540||80||768||1212||
+ ||random reads ||428||568||540||80||768||768||1212||
- ||random reads (mem)||-||-||-||-||-||10811||
+ ||random reads (mem)||-||-||-||-||-||-||10811||
- ||random writes||2167||2218||9986||-||-||8850||
+ ||random writes||2167||2218||9986||-||-||-||8850||
- ||sequential reads||427||582||464||-||-||4425||
+ ||sequential reads||427||582||464||-||-||-||4425||
- ||sequential writes||2076||5684||9892||7182||7519||8547||
+ ||sequential writes||2076||5684||9892||7182||14027||7519||8547||
- ||scans||3737||55692||20971||20560||55555||15385||
+ ||scans||3737||55692||20971||20560||14742||55555||15385||
  Some improvement writing and scanning (faster than BigTable paper seemingly).  Random Reads
still lag.  Sequential Reads lag badly.  A bit of fetch-ahead as we did scanning should help
@@ -197, +197 @@

  Of note, the mapfile numbers are lower than those of hbase when writing because the mapfile
tests write one file whereas hbase, after the first split, is writing to multiple files concurrently.
 On the other hand, hbase random read is very like mapfile random read, at least in the single
client case; we're effectively asking the filesystem for a random value from the midst of
a file in both cases.  The mapfile numbers are useful as a gauge of how much hdfs has improved
since the last time we ran PE.
- Block compression (zlib -- hbase bug won't let you specify lzo) is a little slower writing,
way slower random-reading but about same scanning.
+ Block compression (native zlib -- hbase bug won't let you specify anything but the DefaultCodec,
e.g. lzo) is a little slower writing, way slower random-reading but about same scanning.
- Will post a new state, 8 concurrent clients, in a while so we can start tracking how we
are doing when contending clients.
+ The 8 concurrent clients write to a single regionserver instance.  Our cluster is four computers.
 Load was put up by running an MR job as follows: {{{$ ./bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation
randomRead  8}}} The MR job ran two mappers per computer, so 8 clients were running concurrently.  Timings
were those reported at the head of the MR job page in the UI.
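
As a rough cross-check of the remarks above about writes, scans and reads relative to the BigTable paper, the updated ("+") table rows can be compared column against column. What follows is a minimal sketch in Python, assuming the figures are throughput-style numbers where higher is better (the message does not state units). The names COLUMNS, ROWS and ratio_to_baseline are made up for illustration and are not part of PerformanceEvaluation.

```python
# Figures copied from the updated ("+") rows of the table in this message.
# None stands for a "-" cell (no measurement reported).
COLUMNS = [
    "0.2.0java6",
    "mapfile0.17.1",
    "0.19.0RC1 Java6",
    "0.19.0RC1 Java6 Zlib",
    "0.19.0RC1 Java6, 8 clients",
    "mapfile0.19.0",
    "BigTable",
]

ROWS = {
    "random reads":       [428, 568, 540, 80, 768, 768, 1212],
    "random reads (mem)": [None, None, None, None, None, None, 10811],
    "random writes":      [2167, 2218, 9986, None, None, None, 8850],
    "sequential reads":   [427, 582, 464, None, None, None, 4425],
    "sequential writes":  [2076, 5684, 9892, 7182, 14027, 7519, 8547],
    "scans":              [3737, 55692, 20971, 20560, 14742, 55555, 15385],
}


def ratio_to_baseline(rows, columns, column, baseline="BigTable"):
    """Return {experiment: value / baseline value} for one column.

    Rows where either cell is missing (None) are skipped.
    Assumption: larger numbers mean better performance.
    """
    i = columns.index(column)
    j = columns.index(baseline)
    return {
        name: cells[i] / cells[j]
        for name, cells in rows.items()
        if cells[i] is not None and cells[j] is not None
    }


if __name__ == "__main__":
    for name, ratio in ratio_to_baseline(ROWS, COLUMNS, "0.19.0RC1 Java6").items():
        print(f"{name:20s} {ratio:.2f}x")
```

For the single-client 0.19.0RC1 Java6 column, the ratios come out above 1 for random writes, sequential writes and scans, and below 1 for random and sequential reads, which agrees with the summary in the page text.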
