hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: independent scans to same region processed serially
Date Sat, 09 Feb 2013 16:49:19 GMT
HBASE-7336 only deals with parallel reads on the same HFile, since each HFile only has a single
For scans you want to do seek+read (as opposed to positional reads). The problem with seek+read
is that it can only be done by a single thread at a time.
So HBASE-7336 just switches the read to a positional read if the reader is already locked.
(somewhat of a hack)
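
To illustrate the difference between seek+read and a positional read, here is a minimal sketch using only the JDK. It is not the HBase or HDFS code: the class name SharedFileReader and its methods are hypothetical, and the lock-then-fall-back logic is only an approximation of the idea described above.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not taken from HBase.
class SharedFileReader implements AutoCloseable {
    private final RandomAccessFile file;
    private final FileChannel channel;
    // Guards the shared file pointer that seek+read moves.
    private final ReentrantLock streamLock = new ReentrantLock();

    SharedFileReader(String path) throws IOException {
        this.file = new RandomAccessFile(path, "r");
        this.channel = file.getChannel();
    }

    // seek+read: moves the shared file pointer, so the caller must hold the lock.
    void seekRead(long offset, byte[] buf) throws IOException {
        file.seek(offset);
        file.readFully(buf);
    }

    // Positional read: reads at an absolute offset without touching the
    // shared file pointer, so it needs no lock.
    void pread(long offset, byte[] buf) throws IOException {
        ByteBuffer dst = ByteBuffer.wrap(buf);
        long pos = offset;
        while (dst.hasRemaining()) {
            int n = channel.read(dst, pos);
            if (n < 0) {
                throw new EOFException("reached end of file at offset " + pos);
            }
            pos += n;
        }
    }

    // Use seek+read if the stream is free; if another thread holds the lock,
    // fall back to a positional read instead of waiting.
    byte[] read(long offset, int len) throws IOException {
        byte[] buf = new byte[len];
        if (streamLock.tryLock()) {
            try {
                seekRead(offset, buf);
            } finally {
                streamLock.unlock();
            }
        } else {
            pread(offset, buf);
        }
        return buf;
    }

    @Override
    public void close() throws IOException {
        file.close(); // also closes the associated channel
    }
}
```

In this sketch, a thread that finds the lock taken does not block behind the thread doing seek+read; it pays for a positional read instead.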

-- Lars

From: ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com>
To: user@hbase.apache.org 
Sent: Saturday, February 9, 2013 2:48 AM
Subject: Re: independent scans to same region processed serially

What do you see in the thread dump?  Maybe HBASE-7336 deals with scans
hitting the same block of data. But I see from your mail that the scans are
independent of each other and they scan different data but in the same


On Sat, Feb 9, 2013 at 11:22 AM, James Taylor <jtaylor@salesforce.com> wrote:

> All data is in the blockcache and there are plenty of handlers. To repro, you
> could:
> - create a table pre-split into, for example, three regions
> - execute serially a scan on the middle region
> - execute two parallel scans each on half of the middle region
> - you'd expect the parallel scans to execute nearly twice as fast, but we're
> seeing it execute slower than the serial scan.
> We're using the same HConnection with different HTable instances for each
> scan.
>     James
> On 02/08/2013 06:51 PM, lars hofhansl wrote:
>> Is your data all in the blockcache? Otherwise you might have run into
>> HBASE-7336 (https://issues.apache.org/jira/browse/HBASE-7336). Fixed in 0.94.4.
>> I assume you have enough handlers, etc. (i.e. does the same happen if you
>> issue multiple scan requests across different regions of the same region
>> server?)
>> -- Lars
>> ________________________________
>>   From: James Taylor <jtaylor@salesforce.com>
>> To: HBase User <user@hbase.apache.org>
>> Sent: Friday, February 8, 2013 5:49 PM
>> Subject: independent scans to same region processed serially
>>   Wanted to check with folks and see if they've seen an issue around this
>> before digging in deeper. I'm on 0.94.2. If I execute in parallel multiple
>> scans to different parts of the same region, they appear to be processed
>> serially. It's actually faster from the client side to execute a single
>> serial scan than it is to execute multiple parallel scans to different
>> segments of the region. I do have region observer coprocessors for the
>> table I'm scanning, but my code is not doing any synchronization.
>> Is there a known limitation in this area? Anyone else see anything
>> similar?
>>      James
