hbase-user mailing list archives

From Kumiko Yada <Kumiko.Y...@ds-iq.com>
Subject RE: Put performance test
Date Wed, 23 Dec 2015 17:47:26 GMT
I will try this.  Thanks.

-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: Tuesday, December 22, 2015 1:18 PM
To: user@hbase.apache.org
Subject: Re: Put performance test

Kumiko:
You can define your own YCSB workload by specifying the readproportion and scanproportion
you want.
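
For example, a puts-only workload file might look roughly like this (the
property names are the standard CoreWorkload ones; the counts and field
sizes below are only placeholders to adjust):

  # puts-only.workload (illustrative values only)
  workload=com.yahoo.ycsb.workloads.CoreWorkload
  recordcount=10000000
  operationcount=10000000
  fieldcount=1
  fieldlength=1024
  readproportion=0
  updateproportion=0
  scanproportion=0
  insertproportion=1.0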

FYI

On Tue, Dec 22, 2015 at 11:39 AM, iain wright <iainwrig@gmail.com> wrote:

> You could use YCSB and a custom workload (I don't see a predefined 
> workload for 100% puts without reads)
>
> https://github.com/brianfrankcooper/YCSB/wiki/Core-Workloads
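>
> As a rough sketch, loading such a workload might look like this (the
> hbase10 binding name, workload path, and table/columnfamily values are
> assumptions for your setup; note that the YCSB "load" phase is inserts-only,
> which is effectively a 100% put test):
>
>   $ bin/ycsb load hbase10 -P workloads/puts-only \
>       -p table=usertable -p columnfamily=cf -p recordcount=10000000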
>
> HBase also has a utility for running some evaluations via MR or a 
> thread-based client:
>
> $ ./hbase org.apache.hadoop.hbase.PerformanceEvaluation
> Usage: java org.apache.hadoop.hbase.PerformanceEvaluation \
>   <OPTIONS> [-D<property=value>]* <command> <nclients>
>
> Options:
>  nomapred        Run multiple clients using threads (rather than use
>                  mapreduce)
>  rows            Rows each client runs. Default: One million
>  size            Total size in GiB. Mutually exclusive with --rows.
>                  Default: 1.0.
>  sampleRate      Execute test on a sample of total rows. Only supported by
>                  randomRead. Default: 1.0
>  traceRate       Enable HTrace spans. Initiate tracing every N rows.
>                  Default: 0
>  table           Alternate table name. Default: 'TestTable'
>  multiGet        If >0, when doing RandomRead, perform multiple gets
>                  instead of single gets. Default: 0
>  compress        Compression type to use (GZ, LZO, ...). Default: 'NONE'
>  flushCommits    Used to determine if the test should flush the table.
>                  Default: false
>  writeToWAL      Set writeToWAL on puts. Default: True
>  autoFlush       Set autoFlush on htable. Default: False
>  oneCon          All the threads share the same connection. Default: False
>  presplit        Create presplit table. Recommended for accurate perf
>                  analysis (see guide). Default: disabled
>  inmemory        Tries to keep the HFiles of the CF in memory as far as
>                  possible. Not guaranteed that reads are always served from
>                  memory. Default: false
>  usetags         Writes tags along with KVs. Use with HFile V3. Default:
>                  false
>  numoftags       Specify the number of tags that would be needed. This
>                  works only if usetags is true.
>  filterAll       Helps to filter out all the rows on the server side,
>                  thereby not returning anything back to the client. Helps
>                  to check the server-side performance. Uses FilterAllFilter
>                  internally.
>  latency         Set to report operation latencies. Default: False
>  bloomFilter     Bloom filter type, one of [NONE, ROW, ROWCOL]
>  valueSize       Pass value size to use. Default: 1024
>  valueRandom     Set if we should vary value size between 0 and
>                  'valueSize'; set on read for stats on size. Default: Not set.
>  valueZipf       Set if we should vary value size between 0 and 'valueSize'
>                  in zipf form. Default: Not set.
>  period          Report every 'period' rows. Default: opts.perClientRunRows / 10
>  multiGet        Batch gets together into groups of N. Only supported by
>                  randomRead. Default: disabled
>  addColumns      Adds columns to scans/gets explicitly. Default: true
>  replicas        Enable region replica testing. Default: 1.
>  splitPolicy     Specify a custom RegionSplitPolicy for the table.
>  randomSleep     Do a random sleep before each get between 0 and the
>                  entered value. Default: 0
>  columns         Columns to write per row. Default: 1
>  caching         Scan caching to use. Default: 30
>
>  Note: -D properties will be applied to the conf used.
>   For example:
>    -Dmapreduce.output.fileoutputformat.compress=true
>    -Dmapreduce.task.timeout=60000
>
> Command:
>  filterScan      Run scan test using a filter to find a specific row based
>                  on its value (make sure to use --rows=20)
>  randomRead      Run random read test
>  randomSeekScan  Run random seek and scan 100 test
>  randomWrite     Run random write test
>  scan            Run scan test (read every row)
>  scanRange10     Run random seek scan with both start and stop row (max 10
>                  rows)
>  scanRange100    Run random seek scan with both start and stop row (max
>                  100 rows)
>  scanRange1000   Run random seek scan with both start and stop row (max
>                  1000 rows)
>  scanRange10000  Run random seek scan with both start and stop row (max
>                  10000 rows)
>  sequentialRead  Run sequential read test
>  sequentialWrite Run sequential write test
>
> Args:
>  nclients        Integer. Required. Total number of clients (and
>                  HRegionServers) running: 1 <= value <= 500
> Examples:
>  To run a single evaluation client:
>  $ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
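>
> For a puts-only test along the lines asked about, something like the
> following may work (the flags are from the usage text above; the presplit,
> value size, row count, and client count are only illustrative and should be
> sized for the cluster; with --nomapred, each of the N client threads writes
> --rows rows):
>
>  $ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation \
>      --nomapred --presplit=10 --autoFlush=true --valueSize=1024 \
>      --rows=1000000 randomWrite 10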
>
>
>
> --
> Iain Wright
>
> This email message is confidential, intended only for the recipient(s) 
> named above and may contain information that is privileged, exempt 
> from disclosure under applicable law. If you are not the intended 
> recipient, do not disclose or disseminate the message to anyone except 
> the intended recipient. If you have received this message in error, or 
> are not the named recipient(s), please immediately notify the sender 
> by return email, and delete all copies of this message.
>
> On Tue, Dec 22, 2015 at 11:12 AM, Kumiko Yada <Kumiko.Yada@ds-iq.com>
> wrote:
>
> > To add to that: I don't want to use bulk insert for this test.
> >
> > Thanks
> > Kumiko
> >
> > -----Original Message-----
> > From: Kumiko Yada [mailto:Kumiko.Yada@ds-iq.com]
> > Sent: Tuesday, December 22, 2015 11:01 AM
> > To: user@hbase.apache.org
> > Subject: Put performance test
> >
> > Hello,
> >
> > I wrote a Python script with the happybase library to run a put 
> > performance test; however, the library crashes when more than 900000 
> > rows are put.  I'd like to run 1/10/100 million row put tests.  Is 
> > there a tool that I can use for this?
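> >
> > For reference, a minimal happybase put loop that batches writes
> > (hypothetical Thrift host, table, and column names; the batch size is
> > just a starting point to tune) would look roughly like this:
> >
> >   import happybase
> >
> >   # Assumed Thrift host/port, table, and column family; adjust for the cluster.
> >   connection = happybase.Connection('thrift-host', port=9090)
> >   table = connection.table('perf_test')
> >
> >   # Grouping puts into batches keeps each Thrift call small instead of
> >   # accumulating millions of pending mutations on the client side.
> >   with table.batch(batch_size=1000) as batch:
> >       for i in range(1000000):
> >           batch.put(('row-%010d' % i).encode(), {b'cf:col': b'x' * 1024})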
> >
> > Thanks
> > Kumiko
> >
>