hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12031) Parallel Scanners inside Region
Date Fri, 26 Sep 2014 23:14:33 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150173#comment-14150173 ]

stack commented on HBASE-12031:
-------------------------------

bq. HFileContext carries the information on some of the metadata about the HFile - this is
per HFile. FileReaderContext is per Reader (Scanner).
bq. Probably, yes but need a place (class) to move the actual read to.

Can this not be done inside the existing model, via changes to HFileBlock and/or changes in
the scanners, rather than doing this 'cross-cut'?

bq. Not all data from the read-ahead buffer should be cached in the general case. Sharing some
data between scanners in the RA buffer is not a common case.

Tell me more about the use case then?  Many scanners inside the same region, but they will
not be scanning the same files?  RA buffers per scanner will need to be accounted for so we can
measure memory usage. Since we read in the block anyway, why not go via the blockcache unless
we know for sure the read is one-shot only?
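
A minimal sketch of the kind of per-scanner accounting meant here. Everything in it is an
assumption for illustration: ReadAheadAccounting is a hypothetical class, not part of HBase
or of the attached patches, and the region-wide byte cap is an assumed knob.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical tracker for read-ahead (RA) buffer memory; not an HBase class. */
class ReadAheadAccounting {
    private final long maxBytes;                            // assumed region-wide cap
    private final AtomicLong usedBytes = new AtomicLong();  // total bytes currently reserved
    private final ConcurrentHashMap<Long, Long> perScanner = new ConcurrentHashMap<>();

    ReadAheadAccounting(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Reserve bytes for a scanner's RA buffer; returns false if the cap would be exceeded. */
    boolean reserve(long scannerId, long bytes) {
        while (true) {
            long current = usedBytes.get();
            if (current + bytes > maxBytes) {
                return false;                               // caller falls back to plain reads
            }
            if (usedBytes.compareAndSet(current, current + bytes)) {
                perScanner.merge(scannerId, bytes, Long::sum);
                return true;
            }
        }
    }

    /** Release everything a scanner holds, e.g. when the scanner closes. */
    long release(long scannerId) {
        Long held = perScanner.remove(scannerId);
        if (held == null) {
            return 0L;
        }
        usedBytes.addAndGet(-held);
        return held;
    }

    /** Total RA bytes reserved across all scanners; this is the number a metric would report. */
    long used() {
        return usedBytes.get();
    }
}
```

With something like this, total RA memory is a single number that can be exported as a
metric, and a scanner that cannot reserve a buffer simply reads without readahead.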

Will readahead be a scanner option?

Good stuff [~vrodionov]




> Parallel Scanners inside Region
> -------------------------------
>
>                 Key: HBASE-12031
>                 URL: https://issues.apache.org/jira/browse/HBASE-12031
>             Project: HBase
>          Issue Type: New Feature
>          Components: Performance, Scanners
>    Affects Versions: 0.98.6
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: 1.0.0, 2.0.0, 0.98.7, 0.99.1
>
>         Attachments: HBASE-12031.2.patch, HBASE-12031.3.patch, HBASE-12031.patch, ParallelScannerDesign.pdf, hbase-12031-tests.tar.gz
>
>
> This JIRA is to improve the performance of multiple scanners running on the same region in parallel.
> The scenarios where we will get the performance benefits:
> * New TableInputFormat with input splits smaller than HBase Region.
> * Scanning during compaction (Compaction scanner and application scanner over the same Region).
> Some JIRAs related to this one:
> https://issues.apache.org/jira/browse/HBASE-7336
> https://issues.apache.org/jira/browse/HBASE-5979 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
