hbase-user mailing list archives

From ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>
Subject Re: independent scans to same region processed serially
Date Sat, 09 Feb 2013 10:48:39 GMT
What do you see in the thread dump? HBASE-7336 may deal with scans
hitting the same block of data, but I see from your mail that the scans are
independent of each other and read different data within the same
region.

Regards
Ram

On Sat, Feb 9, 2013 at 11:22 AM, James Taylor <jtaylor@salesforce.com> wrote:

> All data is in the blockcache and there are plenty of handlers. To repro, you
> could:
> - create a table pre-split into, for example, three regions
> - execute a single serial scan over the middle region
> - execute two parallel scans, each over half of the middle region
> - you'd expect the parallel scans to finish nearly twice as fast, but we're
> seeing them run slower than the serial scan.
> We're using the same HConnection with different HTable instances for each
> scan.
>
>     James
>
>
> On 02/08/2013 06:51 PM, lars hofhansl wrote:
>
>> Is your data all in the blockcache? Otherwise you might have run into
>> HBASE-7336 (https://issues.apache.org/jira/browse/HBASE-7336). Fixed in 0.94.4.
>> I assume you have enough handlers, etc. (i.e. does the same happen if
>> you issue multiple scan requests across different regions of the same
>> region server?)
>>
>>
>> -- Lars
>>
>>
>>
>> ________________________________
>>   From: James Taylor <jtaylor@salesforce.com>
>> To: HBase User <user@hbase.apache.org>
>> Sent: Friday, February 8, 2013 5:49 PM
>> Subject: independent scans to same region processed serially
>>   Wanted to check with folks and see if they've seen an issue around this
>> before digging in deeper. I'm on 0.94.2. If I execute in parallel multiple
>> scans to different parts of the same region, they appear to be processed
>> serially. It's actually faster from the client side to execute a single
>> serial scan than it is to execute multiple parallel scans to different
>> segments of the region. I do have region observer coprocessors for the
>> table I'm scanning, but my code is not doing any synchronization.
>>
>> Is there a known limitation in this area? Anyone else see anything
>> similar?
>>
>>      James
>>
>
>
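
The comparison James describes (one serial scan over a row range versus two parallel scans over its halves) could be timed with a small harness like the sketch below. It is only an illustration under stated assumptions: `ScanTimingHarness` and `RangeScan` are hypothetical names, rows are modelled as integers, and the scan itself is an in-memory stand-in. Against a real cluster, each `RangeScan` would presumably wrap `HTable.getScanner(new Scan(startRow, stopRow))` on its own `HTable` instance, as in the original report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScanTimingHarness {

    /** Hypothetical stand-in for one range scan; returns the number of rows seen. */
    public interface RangeScan {
        long scan(int startRow, int stopRow);
    }

    /** Runs a single scan over [start, stop) and returns the row count. */
    public static long serial(RangeScan s, int start, int stop) {
        return s.scan(start, stop);
    }

    /** Splits [start, stop) into contiguous sub-ranges, scans them concurrently, and sums the counts. */
    public static long parallel(RangeScan s, int start, int stop, int parts) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            int step = (stop - start) / parts;
            for (int i = 0; i < parts; i++) {
                final int lo = start + i * step;
                // The last sub-range absorbs any remainder.
                final int hi = (i == parts - 1) ? stop : lo + step;
                Callable<Long> task = () -> s.scan(lo, hi);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for scans", e);
        } catch (Exception e) {
            throw new IllegalStateException("scan failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in (assumption): pretends every row in the range exists.
        RangeScan fake = (lo, hi) -> hi - lo;

        long t0 = System.nanoTime();
        long serialRows = serial(fake, 0, 1000);
        long t1 = System.nanoTime();
        long parallelRows = parallel(fake, 0, 1000, 2);
        long t2 = System.nanoTime();

        System.out.println("serial:   rows=" + serialRows + " ns=" + (t1 - t0));
        System.out.println("parallel: rows=" + parallelRows + " ns=" + (t2 - t1));
    }
}
```

Both paths should return the same row count; only the elapsed times are expected to differ. With the stand-in scan the timings mean nothing, so the numbers become interesting only once `RangeScan` is backed by real scans against the pre-split table.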
