hbase-user mailing list archives

From Kiru Pakkirisamy <kirupakkiris...@yahoo.com>
Subject Re: Client Get vs Coprocessor scan performance
Date Fri, 09 Aug 2013 05:58:57 GMT
Awesome..awesome. Thanks.
Will soon report back.
 
Regards,
- kiru


Kiru Pakkirisamy | webcloudtech.wordpress.com


________________________________
 From: Ted Yu <yuzhihong@gmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org> 
Cc: "user@hbase.apache.org" <user@hbase.apache.org> 
Sent: Thursday, August 8, 2013 10:44 PM
Subject: Re: Client Get vs Coprocessor scan performance
 

I think you need HBASE-6870 which went into 0.94.8

Upgrading should boost coprocessor performance. 

Cheers

On Aug 8, 2013, at 10:21 PM, Kiru Pakkirisamy <kirupakkirisamy@yahoo.com> wrote:

> Ted,
> Here is the method signature/protocol
> public Map<String, Double> getFoo(Map<String, Double> input,
> int topN) throws IOException;
> 
> There are 31 regions on 4 nodes X 8 CPU.
> I am on 0.94.6 (from Hortonworks).
> I think it behaves the way linwukang describes: it is almost a full table scan in the coprocessor.
> Actually, when I set more specific ColumnPrefixFilters, performance went down.
> I want to do things on the server side because I don't want to be sending 500K column/values to the client.
> I cannot believe a single-threaded client which does some calculations and a group-by beats the coprocessor running in 31 regions.
>  
> Regards,
> - kiru
> 
> 
> Kiru Pakkirisamy | webcloudtech.wordpress.com
> 
> 
> ________________________________
> From: Ted Yu <yuzhihong@gmail.com>
> To: user@hbase.apache.org; Kiru Pakkirisamy <kirupakkirisamy@yahoo.com> 
> Sent: Thursday, August 8, 2013 8:40 PM
> Subject: Re: Client Get vs Coprocessor scan performance
> 
> 
> Can you give us a bit more information?
> 
> How do you deliver the 55 rowkeys to your endpoint?
> How many regions do you have for this table?
> 
> What HBase version are you using?
> 
> Thanks
> 
> On Thu, Aug 8, 2013 at 6:43 PM, Kiru Pakkirisamy <kirupakkirisamy@yahoo.com> wrote:
> 
>> Hi,
>> I am seeing odd behavior: coprocessor performance lags behind a
>> client-side Get.
>> I have a table with 500000 rows. Each row has a variable number of columns in one
>> column family (in this case about 600000 columns in total are processed).
>> When I try to get 55 specific rows, the client side completes in half the
>> time of the coprocessor endpoint.
>> I am using 55 RowFilters on the coprocessor scan side. The rows are
>> processed in exactly the same way in both cases.
>> Any pointers on how to debug this scenario?
>> 
>> Regards,
>> - kiru
>> 
>> 
>> Kiru Pakkirisamy | webcloudtech.wordpress.com
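
The gap described in this thread (55 client-side Gets beating a coprocessor scan with 55 RowFilters) can be illustrated with a small model. The sketch below is not HBase code and does not use the HBase API: a java.util.TreeMap stands in for one region's sorted rows, and the names ScanVsGetModel, filteredFullScan and pointLookups are hypothetical. It assumes, as suggested in the thread, that a scan restricted only by row filters still examines every row, while Gets look each key up directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ScanVsGetModel {

    /** Rows returned by a strategy, plus how many rows it had to examine. */
    public static final class Result {
        public final List<String> rows;
        public final int rowsExamined;

        Result(List<String> rows, int rowsExamined) {
            this.rows = rows;
            this.rowsExamined = rowsExamined;
        }
    }

    /** Models a scan with no start/stop row: the filter is tested on every row. */
    public static Result filteredFullScan(TreeMap<String, Double> region, Set<String> wanted) {
        List<String> hits = new ArrayList<>();
        int examined = 0;
        for (String key : region.keySet()) {
            examined++;
            if (wanted.contains(key)) {
                hits.add(key);
            }
        }
        return new Result(hits, examined);
    }

    /** Models a batch of Gets: one direct lookup per wanted row key. */
    public static Result pointLookups(TreeMap<String, Double> region, Set<String> wanted) {
        List<String> hits = new ArrayList<>();
        int examined = 0;
        for (String key : wanted) {
            examined++;
            if (region.containsKey(key)) {
                hits.add(key);
            }
        }
        return new Result(hits, examined);
    }

    public static void main(String[] args) {
        // 500000 rows and 55 wanted keys, mirroring the numbers quoted in the thread.
        TreeMap<String, Double> region = new TreeMap<>();
        for (int i = 0; i < 500000; i++) {
            region.put(String.format("row%06d", i), (double) i);
        }
        Set<String> wanted = new TreeSet<>();
        for (int i = 0; i < 55; i++) {
            wanted.add(String.format("row%06d", i * 9000));
        }

        Result scan = filteredFullScan(region, wanted);
        Result gets = pointLookups(region, wanted);
        System.out.println("filtered scan examined " + scan.rowsExamined + " rows");
        System.out.println("point lookups examined " + gets.rowsExamined + " rows");
        System.out.println("same rows returned: " + scan.rows.equals(gets.rows));
    }
}
```

Both strategies return the same rows, but the filtered scan's work grows with the size of the table, while the lookups' work grows only with the number of requested keys. In this model that difference alone is enough for a single-threaded client to outrun the server-side scan; whether it fully explains the timings reported here is not established by the thread.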