hbase-issues mailing list archives

From "Karthik Ranganathan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6874) Implement prefetching for scanners
Date Thu, 01 Nov 2012 04:59:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488463#comment-13488463 ]

Karthik Ranganathan commented on HBASE-6874:

Thought about the N scanners; it's a complicated change, since you would have to change the entire
scan protocol. The next calls in a scan are not numbered, so you could get out of
sync when prefetching N (and throw in exceptions). There is also the basic issue right now
that scans do retries, which is wrong. Also, reasoning about it another way: if your in-memory
scan throughput is higher than the rate at which you can read from disk, you're probably good. I found
that there are other unrelated bottlenecks preventing this from being the case. Of course, if the
filtering is very heavy then this will break down; in that case you probably want to implement
prefetching based on the number of filtered rows, which should not be too hard.

I have a patch I have tested with, but it's waiting on HBASE-6770, which is going to refactor
scans quite a bit. Will put a patch out once that is done.
> Implement prefetching for scanners
> ----------------------------------
>                 Key: HBASE-6874
>                 URL: https://issues.apache.org/jira/browse/HBASE-6874
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Karthik Ranganathan
>            Assignee: Karthik Ranganathan
> I did some quick experiments by scanning data that should be completely in memory and
> found that adding pre-fetching increases the throughput by about 50% from 26MB/s to 39MB/s.
> The idea is to perform the next in a background thread, and keep the result ready. When
> the scanner's next comes in, return the pre-computed result and issue another background read.
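
The idea in the description can be sketched roughly as follows. This is only an illustration in plain Java, not the patch mentioned in the comment and not HBase code: BatchSource and PrefetchingScanner are hypothetical names, an empty batch is assumed to mean end of scan, and the protocol issues raised in the comment (unnumbered next calls, retries, heavy filtering) are not handled.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a scanner: returns the next batch, or an empty list at the end. */
interface BatchSource<T> {
    List<T> nextBatch() throws Exception;
}

/** Keeps exactly one batch prefetched by a background thread (a sketch, not HBase's implementation). */
final class PrefetchingScanner<T> implements AutoCloseable {
    // Single worker thread, so reads from the underlying source never overlap each other.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "scanner-prefetch");
        t.setDaemon(true);
        return t;
    });
    private final Callable<List<T>> fetch;
    private Future<List<T>> pending;
    private boolean exhausted;

    PrefetchingScanner(BatchSource<T> source) {
        this.fetch = source::nextBatch;
        this.pending = executor.submit(fetch); // start the first read before anyone asks for it
    }

    /** Returns the precomputed batch and, unless the scan is finished, issues another background read. */
    List<T> next() {
        if (exhausted) {
            return Collections.emptyList();
        }
        final List<T> batch;
        try {
            batch = pending.get(); // blocks only if the background read has not completed yet
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for prefetch", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("background read failed", e.getCause());
        }
        if (batch.isEmpty()) {
            exhausted = true; // assumed end-of-scan marker: stop prefetching
        } else {
            pending = executor.submit(fetch); // overlap the next read with the caller's processing
        }
        return batch;
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

A caller would wrap its source once and then call next() in a loop until it returns an empty batch. Only one read is ever in flight, so batches come back in order; prefetching N batches at a time would need the numbering of next calls discussed in the comment above.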

