hbase-issues mailing list archives

From "Karthik Ranganathan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6874) Implement prefetching for scanners
Date Tue, 06 Nov 2012 18:42:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491695#comment-13491695 ]

Karthik Ranganathan commented on HBASE-6874:

Lars - the dependency on HBASE-6770 is mostly about keeping the code simple. Currently, the
HRegionServer loops over numRows, while the RegionScanner loops over the columns in the various
CFs, but only for one row. HBASE-6770 will move the loop over numRows into the RegionScanner
itself, because we need to track both memory size and number of rows in order to respect the
more restrictive of the two. Once that happens, we can implement prefetching entirely in the
RegionScanner instead of spreading the logic across HRegionServer as well. So it is more a
matter of code simplicity and of avoiding merge conflicts.
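
To illustrate what "respect the more restrictive of the two" could look like once the row loop
lives in one place, here is a minimal sketch. It is not HBase code: BoundedBatcher, nextBatch,
maxRows and maxBytes are hypothetical names, and rows are modeled as plain byte arrays.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; this is not the RegionScanner API.
final class BoundedBatcher {
    /**
     * Collects rows until either the row limit or the byte limit is reached,
     * whichever comes first. The byte limit is checked before reading each row,
     * so a batch may exceed maxBytes by at most one row.
     */
    static List<byte[]> nextBatch(Iterator<byte[]> rows, int maxRows, long maxBytes) {
        List<byte[]> batch = new ArrayList<>();
        long bytes = 0;
        while (rows.hasNext() && batch.size() < maxRows && bytes < maxBytes) {
            byte[] row = rows.next();
            batch.add(row);
            bytes += row.length; // track memory size alongside the row count
        }
        return batch;
    }
}
```

With both limits checked in a single loop, neither the caller nor the scanner needs to know which
limit ended the batch.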
> Implement prefetching for scanners
> ----------------------------------
>                 Key: HBASE-6874
>                 URL: https://issues.apache.org/jira/browse/HBASE-6874
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Karthik Ranganathan
>            Assignee: Karthik Ranganathan
> I did some quick experiments by scanning data that should be completely in memory and
> found that adding pre-fetching increases the throughput by about 50% from 26MB/s to 39MB/s.
> The idea is to perform the next in a background thread, and keep the result ready. When
> the scanner's next comes in, return the pre-computed result and issue another background read.
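
The idea in the description above can be sketched roughly as follows. This is a simplified
illustration and not the HBASE-6874 patch: PrefetchingScanner is a hypothetical class, the
underlying scanner is modeled as a Supplier that returns null when it is exhausted, and error
handling is left out.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical illustration only; this is not the RegionScanner API.
final class PrefetchingScanner<T> implements AutoCloseable {
    private final Supplier<T> source;      // the underlying next() call
    private final ExecutorService executor;
    private CompletableFuture<T> pending;  // result being computed in the background

    PrefetchingScanner(Supplier<T> source) {
        this.source = source;
        this.executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "scanner-prefetch");
            t.setDaemon(true); // do not keep the JVM alive for prefetching
            return t;
        });
        // Start the first read before the caller asks for it.
        this.pending = CompletableFuture.supplyAsync(source, executor);
    }

    /** Returns the pre-computed result and issues another background read. */
    synchronized T next() {
        T result = pending.join(); // blocks only if the prefetch has not finished yet
        if (result == null) {
            // Source is exhausted (assumed convention): stop prefetching.
            pending = CompletableFuture.completedFuture(null);
        } else {
            pending = CompletableFuture.supplyAsync(source, executor);
        }
        return result;
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

The single-threaded executor keeps at most one read in flight, so the underlying source is never
called concurrently. Note that one result is always computed ahead of the caller, which costs
extra memory and may do wasted work if the scan is abandoned early.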

