hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Liu, Ming (HPIT-GADSC)" <ming.l...@hp.com>
Subject RE: Is it possible that HBase update performance is much better than read in YCSB test?
Date Thu, 13 Nov 2014 03:30:43 GMT
Thank you Andrew, this is an excellent answer, I get it now. I will try your hbase client for
a 'fair' test :-)

Best Regards,
Ming

-----Original Message-----
From: Andrew Purtell [mailto:apurtell@apache.org] 
Sent: Thursday, November 13, 2014 2:08 AM
To: user@hbase.apache.org
Cc: DeRoo, John
Subject: Re: Is it possible that HBase update performance is much better than read in YCSB
test?

Try this HBase YCSB client instead:
https://github.com/apurtell/ycsb/tree/new_hbase_client

The HBase YCSB driver in the master repo holds on to one HTable instance per driver thread.
We accumulate writes into a 12MB write buffer before flushing them en masse. This is why the
behavior you are seeing confounds your expectations. It's not correct behavior IMHO. YCSB
wants to measure the round trip of every op, not the non-cost of local caching. Worse, if
we have a lot of driver threads accumulating 12MB of edits more or less at the same rate,
then we will flush these buffers more or less at the same time and stampede the cluster, which
leads to deep valleys in observed write performance of 30-60 seconds or longer.



On Tue, Nov 11, 2014 at 8:40 PM, Liu, Ming (HPIT-GADSC) <ming.liu2@hp.com>
wrote:

> Hi, all,
>
> I am trying to use YCSB to test on our HBase 0.98.5 instance and got a 
> strange result: update is 6x better than read. It is just an exercise, 
> so the HBase is running in a workstation in standalone mode.
> I modified the workloada shipped with YCSB into two new workloads:
> workloadr and workloadu, where workloadr is do 100% read operation and 
> workloadu is do 100% update operation. At the bottom is the workloadr 
> and workloadu config files for your reference.
>
> I found out that the read performance is much worse than the update 
> performance, read is about 6000:
>
> YCSB Client 0.1
> Command line: -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloadr 
> -p columnfamily=family -s -t [OVERALL], RunTime(ms), 16565.0 
> [OVERALL], Throughput(ops/sec), 6036.824630244491
>
> And the update performance is about 36000, 6x better than read.
>
> YCSB Client 0.1
> Command line: -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloadu 
> -p columnfamily=family -s -t [OVERALL], RunTime(ms), 2767.0 [OVERALL], 
> Throughput(ops/sec), 36140.22406938923
>
> Is this possible? IMHO, read should be faster than update.
> Maybe I am wrong in the workload file? Or there is a possibility that 
> update is faster than read? I don't find a YCSB mailing list, if 
> anyone knows, please give me a link, so I can also ask question on 
> that mailing list. But is it possible that put is faster than get in 
> hbase? If not, the result must be wrong and I need to debug the YCSB 
> code to figure out what is going wrong.
>
> Workloadr:
> recordcount=100000
> operationcount=100000
> workload=com.yahoo.ycsb.workloads.CoreWorkload
> readallfields=true
> readproportion=1
> updateproportion=0
> scanproportion=0
> insertproportion=0
> requestdistribution=zipfian
>
> workloadu:
> recordcount=100000
> operationcount=100000
> workload=com.yahoo.ycsb.workloads.CoreWorkload
> readallfields=true
> readproportion=0
> updateproportion=1
> scanproportion=0
> insertproportion=0
> requestdistribution=zipfian
>
>
> Thanks,
> Ming
>



--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
Mime
View raw message