hbase-issues mailing list archives

From "Liyin Tang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10502) [89-fb] ParallelScanner: a client utility to perform multiple scan requests in parallel.
Date Tue, 11 Feb 2014 18:54:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898132#comment-13898132 ]

Liyin Tang commented on HBASE-10502:

Actually, HBASE-9272 + HBASE-10502 is quite effective for optimizing join queries. Assume a
join query in which Table A joins Table B on the row key or some prefix of it. HBASE-9272 is
then useful for issuing the initial scan in parallel to retrieve all the join keys; based on
those join keys, multiple scan queries for Table B can be constructed and submitted in parallel by

> [89-fb] ParallelScanner: a client utility to perform multiple scan requests in parallel.
> ----------------------------------------------------------------------------------------
>                 Key: HBASE-10502
>                 URL: https://issues.apache.org/jira/browse/HBASE-10502
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Liyin Tang
>             Fix For: 0.89-fb
> ParallelScanner is a utility class for the HBase client to perform multiple scan requests
> in parallel. For simplicity, it requires all the scan requests to have the same caching size.
> This class provides 3 very basic functionalities:
> * The initialize function initializes all the ResultScanners by calling {@link HTable#getScanner(Scan)}
> in parallel for each scan request.
> * The next function calls the corresponding {@link ResultScanner#next(int numRows)}
> for each scan request in parallel, and then returns all the results together as a list. If
> the result list is empty, there is no data left in any of the scanners, and the user
> can call {@link #close()} afterwards.
> * The close function closes all the scanners and shuts down the thread pool.
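
The initialize / next / close lifecycle described in the issue could be sketched roughly as follows. This is not the 0.89-fb code. `RowScanner`, `ListScanner`, and `ParallelScannerSketch` are hypothetical names: `RowScanner` stands in for HBase's `ResultScanner` (rows are plain strings), and the `opener` function stands in for `HTable#getScanner(Scan)`, so the example runs without an HBase cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical stand-in for HBase's ResultScanner; rows are plain strings here. */
interface RowScanner {
    /** Returns up to numRows rows; an empty list means this scanner is exhausted. */
    List<String> next(int numRows) throws Exception;
    void close();
}

/** In-memory RowScanner used only to exercise the sketch. */
class ListScanner implements RowScanner {
    private final List<String> rows;
    private int pos = 0;

    ListScanner(List<String> rows) { this.rows = rows; }

    public List<String> next(int numRows) {
        int end = Math.min(pos + numRows, rows.size());
        List<String> batch = new ArrayList<>(rows.subList(pos, end));
        pos = end;
        return batch;
    }

    public void close() { }
}

/** Sketch of the initialize / next / close lifecycle (assumed design, not the real class). */
class ParallelScannerSketch<S> implements AutoCloseable {
    private final ExecutorService pool;
    private final Function<S, RowScanner> opener; // stands in for HTable#getScanner(Scan)
    private final int caching;                    // one caching size shared by all scans
    private final List<RowScanner> scanners = new ArrayList<>();

    ParallelScannerSketch(int threads, int caching, Function<S, RowScanner> opener) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.caching = caching;
        this.opener = opener;
    }

    /** Opens one scanner per scan request, in parallel. */
    void initialize(List<S> scans) throws Exception {
        List<Callable<RowScanner>> tasks = new ArrayList<>();
        for (S scan : scans) {
            tasks.add(() -> opener.apply(scan));
        }
        // invokeAll returns futures in the same order as the submitted tasks.
        for (Future<RowScanner> f : pool.invokeAll(tasks)) {
            scanners.add(f.get());
        }
    }

    /** Fetches one batch from every scanner in parallel; an empty result means all are done. */
    List<String> next() throws Exception {
        List<Callable<List<String>>> tasks = new ArrayList<>();
        for (RowScanner s : scanners) {
            tasks.add(() -> s.next(caching));
        }
        List<String> merged = new ArrayList<>();
        for (Future<List<String>> f : pool.invokeAll(tasks)) {
            merged.addAll(f.get());
        }
        return merged;
    }

    /** Closes all scanners and shuts down the thread pool. */
    @Override
    public void close() {
        for (RowScanner s : scanners) {
            s.close();
        }
        pool.shutdown();
    }
}
```

With a caching size of 2 and two in-memory scans holding `[a1, a2, a3]` and `[b1]`, successive `next()` calls on this sketch return `[a1, a2, b1]`, then `[a3]`, then an empty list, at which point the caller would invoke `close()`.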

This message was sent by Atlassian JIRA
