hbase-issues mailing list archives

From "Vladimir Rodionov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12031) Parallel Scanners inside Region
Date Tue, 23 Sep 2014 20:32:35 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145386#comment-14145386 ]

Vladimir Rodionov commented on HBASE-12031:
-------------------------------------------

[~saint.ack@gmail.com]
{quote}
You are adding a HFileReaderContext. We have a HFileContext already. You cannot reuse/amend
this class to your purposes?
{quote}

HFileContext carries some of the metadata about the HFile, so it is per HFile.
HFileReaderContext is per Reader (Scanner).
{quote}
I would not expect a Context to do the actual reading as in + public boolean read(long offset,
byte[] buffer, int bufOffset, int len). I could imagine passing in a context when you read.
{quote}
Probably yes, but I need a place (a class) to move the actual read to.
{quote}
Regards read-ahead and keeping the buffer local to the scanner, it is not enough just having
the scanner do a read-ahead that ensures blockcache is populated? You have to bring the data
local to the scanner? If multiple concurrent scans, we'll have duplicate data buffered?
{quote}

I am not sure how to implement this. In the general case, not all data in the read-ahead
buffer should be cached. Scanners sharing data in the RA buffer is not a common case.

{quote}
Hard to see what you did in readAtOffset.
{quote}

If the scanner is in *pread* mode, execute the OLD code for *pread*.

Otherwise:

* if read-ahead is disabled (context == null), execute the OLD code;
* else if the streaming lock is enabled, execute the OLD code;
* else (read-ahead enabled && streaming lock not enabled), execute the NEW code: check whether the block can be served from the RA buffer. If yes, read and return the block; otherwise read ahead the next buffer, then read the block and return it.
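
The branch order described above can be sketched roughly as follows. This is only an illustration and not the code from the attached patches: the class and method names (ReadAheadSketch, ReadAheadContext, choosePath, readBlock) are hypothetical, and a plain byte[] stands in for the HFile input stream.

```java
import java.util.Arrays;

/** Hypothetical sketch of the readAtOffset branch selection; not the HBASE-12031 patch. */
final class ReadAheadSketch {

    /** Which code path a read would take. */
    enum Path { OLD_PREAD, OLD_STREAM, NEW_READ_AHEAD }

    /** Hypothetical per-scanner read-ahead (RA) buffer over an in-memory "file". */
    static final class ReadAheadContext {
        private final byte[] file;   // stands in for the HFile stream (assumption)
        private final byte[] buf;    // RA buffer, local to one scanner
        private long bufStart = -1;  // file offset of buf[0]; -1 means the buffer is empty
        private int bufLen = 0;      // number of valid bytes in buf
        int refills = 0;             // how many times the buffer was filled from the "file"

        ReadAheadContext(byte[] file, int bufferSize) {
            this.file = file;
            this.buf = new byte[bufferSize];
        }

        private boolean covers(long offset, int len) {
            return bufStart >= 0 && offset >= bufStart && offset + len <= bufStart + bufLen;
        }

        private void refill(long offset) {
            bufLen = (int) Math.min(buf.length, file.length - offset);
            System.arraycopy(file, (int) offset, buf, 0, bufLen);
            bufStart = offset;
            refills++;
        }

        /** Serve the block from the RA buffer if possible; otherwise read ahead first. */
        byte[] readBlock(long offset, int len) {
            if (offset < 0 || len < 0 || len > buf.length || offset + len > file.length) {
                throw new IllegalArgumentException("block does not fit the buffer or the file");
            }
            if (!covers(offset, len)) {
                refill(offset);
            }
            int from = (int) (offset - bufStart);
            return Arrays.copyOfRange(buf, from, from + len);
        }
    }

    /** Mirrors the decision order from the comment: pread, no context, stream lock, new path. */
    static Path choosePath(boolean pread, ReadAheadContext context, boolean streamLockEnabled) {
        if (pread) {
            return Path.OLD_PREAD;       // OLD code for pread
        }
        if (context == null) {
            return Path.OLD_STREAM;      // read-ahead disabled: OLD code
        }
        if (streamLockEnabled) {
            return Path.OLD_STREAM;      // streaming lock enabled: OLD code
        }
        return Path.NEW_READ_AHEAD;      // NEW code: serve from the RA buffer
    }
}
```

In this sketch, two sequential reads that fall inside the same buffer trigger a single refill, and a read that runs past the end of the buffer triggers another one starting at the requested offset.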



> Parallel Scanners inside Region
> -------------------------------
>
>                 Key: HBASE-12031
>                 URL: https://issues.apache.org/jira/browse/HBASE-12031
>             Project: HBase
>          Issue Type: New Feature
>          Components: Performance, Scanners
>    Affects Versions: 0.98.6
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: 1.0.0, 2.0.0, 0.98.7, 0.99.1
>
>         Attachments: HBASE-12031.2.patch, HBASE-12031.3.patch, HBASE-12031.patch, ParallelScannerDesign.pdf,
hbase-12031-tests.tar.gz
>
>
> This JIRA is to improve the performance of multiple scanners running on the same region in parallel.
> The scenarios where we will get the performance benefits:
> * New TableInputFormat with input splits smaller than an HBase Region.
> * Scanning during compaction (compaction scanner and application scanner over the same Region).
> Some JIRAs related to this one:
> https://issues.apache.org/jira/browse/HBASE-7336
> https://issues.apache.org/jira/browse/HBASE-5979 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
