hbase-issues mailing list archives

From "Nicolas Spiegelberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4163) Create Split Strategy for YCSB Benchmark
Date Thu, 04 Aug 2011 00:45:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079153#comment-13079153 ]

Nicolas Spiegelberg commented on HBASE-4163:
--------------------------------------------

My initial thought is to use the existing RegionSplitter utility.  We just need to create
a custom SplitAlgorithm implementation class for the YCSB key specification and tell
users to run:

{code}
bin/hbase org.apache.hadoop.hbase.util.RegionSplitter TABLE -c 200 -f FAMILY -D split.algorithm=YcsbSplit
{code}

to pre-create a table with 200 regions.  To prevent further splitting, we can either set
hbase.hregion.max.filesize to a very high value or add a per-table split config option.
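
As a rough illustration of the split-point arithmetic such a class would need, here is a
minimal sketch in Python.  It is not the proposed YcsbSplit implementation (that would be
a Java class implementing SplitAlgorithm).  The key format ("user" followed by decimal
digits), the 3-digit prefix width, and the function name ycsb_split_points are all
assumptions made for this example, and it assumes keys are spread evenly over the digit
prefix space.

```python
def ycsb_split_points(num_regions, prefix="user", digits=3):
    """Return num_regions - 1 split keys for a pre-split table.

    Assumption (not from the issue): row keys look like prefix + decimal
    digits and are spread evenly over the first `digits` digits.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    space = 10 ** digits  # number of distinct digit prefixes, e.g. 1000
    if num_regions > space:
        raise ValueError("not enough distinct prefixes for that many regions")
    # Divide the prefix space into num_regions equal slices; each interior
    # boundary becomes a zero-padded split key such as "user005".
    return [
        f"{prefix}{(i * space) // num_regions:0{digits}d}"
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    points = ycsb_split_points(200)
    print(len(points), points[0], points[-1])
```

If the real keys are not uniform over this prefix space, the resulting regions will be
unevenly loaded, so the key distribution should be checked before relying on boundaries
like these.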

> Create Split Strategy for YCSB Benchmark
> ----------------------------------------
>
>                 Key: HBASE-4163
>                 URL: https://issues.apache.org/jira/browse/HBASE-4163
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 0.90.3, 0.92.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Lars George
>            Priority: Minor
>              Labels: benchmark
>
> Talked with Lars about how we can make it easier for users to run the YCSB benchmarks
> against HBase and get realistic results.  Currently, HBase is optimized for the random/uniform
> read/write case, which is the YCSB load.  The initial reason we perform badly when users
> test against us is that they do not pre-split regions and have the split ratio set really
> low.  We need a one-line way for a user to create a table that is pre-split to 200 regions
> (or some decent number) by default and has splitting disabled.  Realistically, this is how a
> uniform-load cluster should scale, so it's not a hack.  This will also give us a good use
> case to point to for how users should pre-split regions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
