accumulo-user mailing list archives

Subject Re: Presplitting tables for the YCSB workloads
Date Fri, 18 Sep 2015 13:07:18 GMT

I don't have a script for you, but if you need to create one, you could use the shell's
script command to do something similar to the HBase script. Some examples are in the
comments on the JIRA issue [1]. If you can figure out how you want the table split, and it
can be scripted, I might have time this weekend to write it up for you.


----- Original Message -----

From: "Sean Busbey" <> 
To: "Accumulo User List" <> 
Sent: Thursday, September 17, 2015 10:10:29 PM 
Subject: Presplitting tables for the YCSB workloads 

YCSB is gearing up for its next monthly release, and I really want to 
add in an Accumulo specific README for running workloads. 

This is mainly so that folks have an easier time running tests 
themselves. It's also because I keep testing Accumulo for the YCSB 
releases, and with a README file in place we'd get an Accumulo-specific 
convenience binary. Avoiding the bulk of the dependencies that get 
included in the generic YCSB distribution artifact is a big win. 

The thing I keep getting hung up on is remembering how to properly 
split the Accumulo table for YCSB workloads. The HBase README has a 
great hbase shell snippet for doing this (because users can copy/paste 

3. Create a HBase table for testing 

For best results, use the pre-splitting strategy recommended in HBASE-4163: 

hbase(main):001:0> n_splits = 200 # HBase recommends (10 * number of 
hbase(main):002:0> create 'usertable', 'family', {SPLITS => (1..n_splits).map {|i| "user#{1000+i*(9999-1000)/n_splits}"}} 

Failing to do so will cause all writes to initially target a single 
region server. 

Anyone have a work up of an equivalent for Accumulo that I can include 
under an ASLv2 license? I seem to recall madrob had something done in 
a bash script, but I can't find it anywhere. 
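
For illustration only, here is a minimal sketch of how the same split points could be generated outside the HBase shell, one per line, so that they can be handed to Accumulo as a splits file. The arithmetic is copied from the HBase snippet above; the function name ycsb_split_points and the choice to print to standard output are hypothetical, not taken from any existing YCSB or Accumulo script.

```ruby
# Hypothetical helper: reproduces the split points produced by the
# HBase shell snippet quoted above ("user1044" ... "user9999" for 200 splits).
def ycsb_split_points(n_splits = 200)
  # Same integer arithmetic as the HBase README expression.
  (1..n_splits).map { |i| "user#{1000 + i * (9999 - 1000) / n_splits}" }
end

if __FILE__ == $PROGRAM_NAME
  # Optional first argument overrides the number of splits.
  n_splits = (ARGV[0] || 200).to_i
  # One split point per line, suitable for redirecting into a file.
  puts ycsb_split_points(n_splits)
end
```

Assuming the output is redirected to a file such as /tmp/ycsb-splits.txt, the Accumulo shell should be able to apply it with something along the lines of "createtable usertable" followed by "addsplits -t usertable -sf /tmp/ycsb-splits.txt". Those option names are from memory, so check the shell's built-in help before relying on them.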


