hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1935) Scan in parallel
Date Sun, 27 Feb 2011 05:31:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12999886#comment-12999886 ]

stack commented on HBASE-1935:
------------------------------

@Otis I'd imagine parallel scanning would make for a very nice speed improvement, especially
when a pretty severe filter runs server-side, skipping most rows and returning only a few.
The attached patch has a bit of a wonky history; I'll spare you the details. It's way stale
too, I'd say. I haven't tried it, and it may depend on server-side behavior that has since
been squashed (again, I haven't verified). Also, the support for parallel scanning should
be moved into HTable, I'd say (as someone above says), rather than living in a new ParallelHTable
class. The patch has very nice tests, though.
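
To illustrate the general idea only (this is not the attached patch, and it uses no HBase API), here is a minimal sketch in plain Java. Sorted maps stand in for regions, and the class name ParallelScanSketch, its methods, and the demo data are all hypothetical. One scan task is submitted per region, a selective filter drops most rows, and results are collected in region order so the combined output stays sorted by row key as a serial scan would be.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical sketch: plain Java collections stand in for regions; no HBase API is used.
public class ParallelScanSketch {

    // Scans one "region" (a sorted row-key -> value map) and keeps the keys
    // of rows whose value passes the filter.
    public static List<String> scanRegion(NavigableMap<String, String> region,
                                          Predicate<String> filter) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> row : region.entrySet()) {
            if (filter.test(row.getValue())) {
                hits.add(row.getKey());
            }
        }
        return hits;
    }

    // Submits one scan task per region, then collects the results in region
    // order so the merged output is ordered like a serial scan.
    public static List<String> scanInParallel(List<NavigableMap<String, String>> regions,
                                              Predicate<String> filter,
                                              int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (NavigableMap<String, String> region : regions) {
                tasks.add(() -> scanRegion(region, filter));
            }
            List<String> merged = new ArrayList<>();
            // invokeAll returns futures in the same order as the task list.
            for (Future<List<String>> f : pool.invokeAll(tasks)) {
                merged.addAll(f.get());
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    // Builds three contiguous regions of 100 rows each:
    // row000..row099, row100..row199, row200..row299.
    public static List<NavigableMap<String, String>> demoRegions() {
        List<NavigableMap<String, String>> regions = new ArrayList<>();
        for (int r = 0; r < 3; r++) {
            NavigableMap<String, String> region = new TreeMap<>();
            for (int i = 0; i < 100; i++) {
                int n = r * 100 + i;
                region.put(String.format("row%03d", n), "value-" + n);
            }
            regions.add(region);
        }
        return regions;
    }

    public static void main(String[] args) throws Exception {
        // A severe filter: only values ending in "77" pass, so most rows are skipped.
        List<String> hits = scanInParallel(demoRegions(), v -> v.endsWith("77"), 3);
        System.out.println(hits); // prints [row077, row177, row277]
    }
}
```

In this toy version each task buffers its whole result before the merge. A real client would presumably need to stream results and bound memory per region, which is part of why the feature is more involved than a serial scan.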

> Scan in parallel
> ----------------
>
>                 Key: HBASE-1935
>                 URL: https://issues.apache.org/jira/browse/HBASE-1935
>             Project: HBase
>          Issue Type: New Feature
>          Components: coprocessors
>            Reporter: stack
>             Fix For: 0.92.0
>
>         Attachments: pscanner-v2.patch, pscanner-v3.patch, pscanner-v4.patch, pscanner.patch
>
>
> A scanner that, rather than scanning in series, instead scanned multiple regions in parallel would be more involved but could complete much faster, particularly if results are sparse.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
