hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hbase/PerformanceEvaluation" by stack
Date Sat, 17 Jan 2009 05:16:04 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation

The comment on the change is:
Numbers for mapfile

------------------------------------------------------------------------------
  
  Also includes numbers for hadoop mapfile.  Table includes last test, 0.2.0java6 (and hadoop
0.17.2) from above for easy comparison.
  
- Start cluster fresh for each test then wait for all regions to be deployed before starting
up tests.  Speedup is a combination of hdfs improvements, hbase improvements including batching when
writing and scanning (the bigtable PE description alludes to scans using prefetch), and use
of two JBOD'd disks -- as in the google paper -- where in the previous tests above, all disks were
RAID'd. Otherwise, hardware is the same, similar to the bigtable paper's dual dual-core opterons,
1G for hbase, etc.
+ Start cluster fresh for each test then wait for all regions to be deployed before starting
up tests (this means there is no content in memcache, so for tests such as random read we always
go to the filesystem and never get values from memcache).
  
  ||<rowbgcolor="#ececec">Experiment Run||0.2.0java6||mapfile0.17.1||0.19.0RC1!Java6||mapfile0.19.0||!BigTable||
- ||random reads ||428||568||540||-||1212||
+ ||random reads ||428||568||540||768||1212||
  ||random reads (mem)||-||-||-||-||10811||
  ||random writes||2167||2218||9986||-||8850||
  ||sequential reads||427||582||464||-||4425||
- ||sequential writes||2076||5684||9892||-||8547||
+ ||sequential writes||2076||5684||9892||7519||8547||
  ||scans||3737||55692||20971||-||15385||
  
  Some improvement in writing and scanning (seemingly faster than the BigTable paper).  Random Reads
still lag.  Sequential Reads lag badly.  A bit of fetch-ahead, as we did for scanning, should help
here.
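
To make the comparison with the BigTable column concrete, the ratios can be computed directly from the table. This is only an illustrative sketch: the figures are copied from the 0.19.0RC1 and BigTable columns, while the dictionary layout and the names `results` and `ratios` are assumptions introduced here and are not part of the PerformanceEvaluation tool.

```python
# Figures copied from the 0.19.0RC1 and BigTable columns of the table.
# The structure and names below are illustrative, not part of hbase.
results = {
    "random reads": {"hbase_0_19_0_rc1": 540, "bigtable": 1212},
    "random writes": {"hbase_0_19_0_rc1": 9986, "bigtable": 8850},
    "sequential reads": {"hbase_0_19_0_rc1": 464, "bigtable": 4425},
    "sequential writes": {"hbase_0_19_0_rc1": 9892, "bigtable": 8547},
    "scans": {"hbase_0_19_0_rc1": 20971, "bigtable": 15385},
}

# Ratio > 1 means the hbase figure is higher than the BigTable figure.
ratios = {
    name: row["hbase_0_19_0_rc1"] / row["bigtable"]
    for name, row in results.items()
}

# Print the experiments from worst to best relative to BigTable.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}x of BigTable")
```

Sorted this way, sequential reads come out lowest and random reads next, with the write and scan rows above 1, which matches the summary above.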
  
+ Speedup is a combination of hdfs improvements, hbase improvements including batching when writing
and scanning (the bigtable PE description alludes to scans using prefetch), and use of two
JBOD'd disks -- as in the google paper -- where in the previous tests above, all disks were RAID'd.
Otherwise, hardware is the same, similar to the bigtable paper's dual dual-core opterons, 1G for
hbase, etc.
+ 
+ Of note, the mapfile numbers are lower than those of hbase when writing because the mapfile
tests write one file whereas hbase, after the first split, is writing to multiple files concurrently.
On the other hand, hbase random read is very much like mapfile random read, at least in the single
client case; we're effectively asking the filesystem for a random value from the midst of
a file in both cases.  The mapfile numbers are useful as a gauge of how much hdfs has come on
since the last time we ran PE.
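
As a rough illustration of using the mapfile numbers as a gauge, the two mapfile columns can be compared for the rows where both have a value. The figures come from the table; the names `mapfile` and `speedup` are hypothetical and chosen only for this sketch.

```python
# Figures copied from the mapfile0.17.1 and mapfile0.19.0 columns.
# Only rows that have a mapfile0.19.0 value are included.
mapfile = {
    "random reads": {"0.17.1": 568, "0.19.0": 768},
    "sequential writes": {"0.17.1": 5684, "0.19.0": 7519},
}

# Ratio of the newer mapfile figure to the older one, per experiment.
speedup = {
    name: versions["0.19.0"] / versions["0.17.1"]
    for name, versions in mapfile.items()
}

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x")
```

Both ratios are above 1, so on these two rows the newer mapfile runs are faster than the older ones.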
+ 
  Will post a new state, 8 concurrent clients, in a while so we can start tracking how we
are doing when clients are contending.
  
