hbase-user mailing list archives

From Kiru Pakkirisamy <kirupakkiris...@yahoo.com>
Subject Re: Client Get vs Coprocessor scan performance
Date Fri, 09 Aug 2013 05:21:46 GMT
Here is the method signature/protocol:
public Map<String, Double> getFoo(Map<String, Double> input,
int topN) throws IOException;
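
Since an endpoint with this shape returns one Map<String, Double> per region, the client still has to combine the partial maps. Below is a minimal sketch of that merge step in plain Java 8, with no HBase classes. The class and method names (TopNMerge, mergeTopN) are hypothetical, and it assumes that scores for the same key are additive across regions, which the thread does not state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of HBase: combines per-region results on the client.
final class TopNMerge {

    /**
     * Sums the scores per key across all per-region maps (assumption: scores
     * are additive), then keeps the topN highest totals in descending order.
     * Ties are broken by key so the output is deterministic.
     */
    static Map<String, Double> mergeTopN(List<Map<String, Double>> perRegion, int topN) {
        Map<String, Double> totals = new HashMap<>();
        for (Map<String, Double> partial : perRegion) {
            for (Map.Entry<String, Double> e : partial.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }

        List<Map.Entry<String, Double>> sorted = new ArrayList<>(totals.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Double>comparingByKey()));

        // LinkedHashMap preserves the descending-score insertion order.
        Map<String, Double> result = new LinkedHashMap<>();
        int limit = Math.min(Math.max(topN, 0), sorted.size());
        for (Map.Entry<String, Double> e : sorted.subList(0, limit)) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

If each region truncates to its own local topN before returning, a merge like this can only approximate the global top N, because a key that ranks low in every region may still rank high in total.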

There are 31 regions on 4 nodes X 8 CPU.
I am on 0.94.6 (from Hortonworks).
I think it behaves the way linwukang describes: it is almost a full table scan in the
Actually, when I set more specific ColumnPrefixFilters, performance went down.
I want to do things on the server side because I don't want to send 500K columns/values
to the client.
I cannot believe that a single-threaded client doing some calculations and a group-by beats
the coprocessor running in 31 regions.
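
One possible explanation, consistent with the "almost a full table scan" remark, is that a scan restricted only by row filters still has to examine every row it passes over, while a Get goes straight to its key. The toy model below illustrates the difference in rows examined. It is plain Java, not HBase code: a TreeMap stands in for a sorted region, and the names ScanVsGetSketch, filterScan and pointGets are hypothetical. It assumes the scan carries no start/stop row.

```java
import java.util.Set;
import java.util.TreeMap;

// Toy model only: a TreeMap plays the role of one sorted region.
final class ScanVsGetSketch {

    /** Builds a sorted table with keys row000000 .. row(rows-1). */
    static TreeMap<String, Double> buildTable(int rows) {
        TreeMap<String, Double> table = new TreeMap<>();
        for (int i = 0; i < rows; i++) {
            table.put(String.format("row%06d", i), (double) i);
        }
        return table;
    }

    /**
     * Filter-style access: walk every row and test it against the wanted keys.
     * Returns {rows examined, rows matched}.
     */
    static int[] filterScan(TreeMap<String, Double> table, Set<String> wanted) {
        int examined = 0;
        int matched = 0;
        for (String key : table.keySet()) {
            examined++;
            if (wanted.contains(key)) {
                matched++;
            }
        }
        return new int[] {examined, matched};
    }

    /**
     * Get-style access: one keyed lookup per wanted row.
     * Returns {lookups issued, rows matched}.
     */
    static int[] pointGets(TreeMap<String, Double> table, Set<String> wanted) {
        int examined = 0;
        int matched = 0;
        for (String key : wanted) {
            examined++;
            if (table.containsKey(key)) {
                matched++;
            }
        }
        return new int[] {examined, matched};
    }
}
```

In this model the filter scan's cost grows with the table size and the point lookups' cost grows with the number of requested keys, so spreading the scan over 31 regions does not necessarily make up the difference when only 55 rows are wanted.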
- kiru

Kiru Pakkirisamy | webcloudtech.wordpress.com

From: Ted Yu <yuzhihong@gmail.com>
To: user@hbase.apache.org; Kiru Pakkirisamy <kirupakkirisamy@yahoo.com>
Sent: Thursday, August 8, 2013 8:40 PM
Subject: Re: Client Get vs Coprocessor scan performance

Can you give us a bit more information?

How do you deliver the 55 rowkeys to your endpoint?
How many regions do you have for this table?

What HBase version are you using?


On Thu, Aug 8, 2013 at 6:43 PM, Kiru Pakkirisamy

> Hi,
> I am seeing odd behavior: coprocessor performance lags behind a
> client-side Get.
> I have a table with 500000 rows. Each has a variable number of columns in one
> column family (in this case about 600000 columns in total are processed).
> When I try to get 55 specific rows, the client side completes in half the
> time of the coprocessor endpoint.
> I am using 55 RowFilters on the coprocessor scan side. The rows are
> processed in exactly the same way in both cases.
> Any pointers on how to debug this scenario?
> Regards,
> - kiru
> Kiru Pakkirisamy | webcloudtech.wordpress.com