hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9272) A simple parallel, unordered scanner
Date Tue, 20 Aug 2013 18:04:52 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745205#comment-13745205 ]

Lars Hofhansl commented on HBASE-9272:
--------------------------------------

More like a MultiGet that farms requests out to multiple RegionServers at the same time, although
I am using a different threading model (a fixed number of threads and an unbounded waiting queue,
rather than the reverse).
There are a lot of options. Right now each region becomes a task and is scheduled on a thread pool.
Tasks could also be grouped by RegionServer.
Obviously this only makes sense when the scan will touch a "reasonable" number of regions.
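
The threading model described above can be sketched roughly as follows. This is a minimal
illustration and not the prototype attached to the issue: the class name, scanRegion(), and the
region names are hypothetical stand-ins, and no HBase client API is used. It only shows a pool
with a fixed number of threads, an unbounded waiting queue, and one task per region.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RegionTaskPoolSketch {

    // Hypothetical stand-in for scanning one region. A real task would open
    // a scanner restricted to that region's key range.
    static List<String> scanRegion(String region) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            rows.add(region + "-row" + i);
        }
        return rows;
    }

    // Schedules one task per region on a fixed-size pool and gathers all rows.
    static List<String> scanAll(List<String> regions, int threads) throws Exception {
        // Fixed number of threads; tasks beyond that wait in an unbounded queue.
        ExecutorService pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (String region : regions) {
                futures.add(pool.submit(() -> scanRegion(region)));
            }
            List<String> all = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                all.addAll(f.get()); // waits for each region task in submission order
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> regions = Arrays.asList("regionA", "regionB", "regionC", "regionD");
        System.out.println(scanAll(regions, 2).size() + " rows scanned");
    }
}
```

Grouping by RegionServer would change only the unit of work: one task per server, each walking
that server's regions in turn.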
                
> A simple parallel, unordered scanner
> ------------------------------------
>
>                 Key: HBASE-9272
>                 URL: https://issues.apache.org/jira/browse/HBASE-9272
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
>            Priority: Minor
>
> The contract of ClientScanner is to return rows in sort order, which limits the order
> in which regions can be scanned.
> I propose a simple ParallelScanner that does not have this requirement and queries regions
> in parallel, returning whatever gets returned first.
> This is generally useful for scans that filter a lot of data on the server, or in cases
> where the client can very quickly react to the returned data.
> I have a simple prototype (it doesn't do error handling right, and might be a bit heavy on
> the synchronization side: it uses a BlockingQueue to hand data between the client using the
> scanner and the threads doing the scanning; it could also potentially starve some scanners
> long enough to time out at the server).
> On the plus side, it's only 130 lines of code. :)
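
The BlockingQueue hand-off mentioned in the description could look something like the sketch
below. It is an illustration under assumptions, not the 130-line prototype: the class name, the
DONE marker convention, and the simulated rows are all hypothetical, and error handling is left
out just as the description admits. Worker threads push rows into a shared queue as they arrive,
and the client drains the queue in whatever order rows show up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class UnorderedHandoffSketch {

    // Marker a worker enqueues once its region is exhausted (hypothetical convention).
    private static final String DONE = "\u0000DONE";

    static List<String> scanUnordered(List<String> regions, int threads, int capacity)
            throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String region : regions) {
            pool.execute(() -> {
                try {
                    // Simulated rows; a real worker would iterate a region scanner.
                    for (int i = 0; i < 3; i++) {
                        queue.put(region + "-row" + i); // blocks while the queue is full
                    }
                    queue.put(DONE);
                } catch (InterruptedException e) {
                    // No real error handling in this sketch.
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Client side: take rows in arrival order until every region has reported DONE.
        List<String> rows = new ArrayList<>();
        int finished = 0;
        while (finished < regions.size()) {
            String item = queue.take();
            if (DONE.equals(item)) {
                finished++;
            } else {
                rows.add(item);
            }
        }
        pool.shutdown();
        return rows;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> rows = scanUnordered(Arrays.asList("regionA", "regionB", "regionC"), 2, 4);
        System.out.println(rows); // order varies from run to run
    }
}
```

With a bounded queue, a slow client leaves workers blocked in put(), which is one plausible way
a scanner could sit idle long enough to time out at the server.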

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
