hbase-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5979) Non-pread DFSInputStreams should be associated with scanners, not HFile.Readers
Date Tue, 22 May 2012 00:17:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280640#comment-13280640 ]

Todd Lipcon commented on HBASE-5979:

Hey Kannan,

Sorry, let me elaborate on that suggestion:

The idea is to make a new FSReader implementation that has only one API. That API would
look like the current positional read call (i.e., take a position and a length).

Internally, it would keep a pool of cached DFSInputStreams, all referencing the same file,
and remember the current position of each. When a read request comes in, it is matched
against the pooled streams: if it falls within N bytes forward of the current position of
one of the streams, a seek and read is issued, synchronized on that stream. Otherwise, an
arbitrary stream is chosen and a positional read is issued on it. Separately, we can track
the last few positional reads: if we detect a sequential pattern in them, we can take one of
the pooled input streams and seek it to the next predicted offset, so that future reads get
the sequential benefit.
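
To make the proposal concrete, here is a rough sketch of the dispatch logic. It is not HBase or HDFS code: java.nio's FileChannel stands in for DFSInputStream, the class name PooledPositionalReader, the maxForwardSkip parameter (the "N bytes forward" threshold), and the two counters are all hypothetical, and the sequential-pattern detection is reduced to "this pread starts where the previous pread ended".

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: FileChannel is only a local stand-in for DFSInputStream.
class PooledPositionalReader implements AutoCloseable {
    // One pooled stream plus the lock that guards its seek+read path.
    private static final class Pooled {
        final FileChannel ch;
        final ReentrantLock lock = new ReentrantLock();
        Pooled(FileChannel ch) { this.ch = ch; }
    }

    private final List<Pooled> pool = new ArrayList<>();
    private final long maxForwardSkip;                          // assumed "N bytes forward" threshold
    private final AtomicLong lastPreadEnd = new AtomicLong(-1); // end offset of the previous pread
    final AtomicLong seekReads = new AtomicLong();              // counters for illustration only
    final AtomicLong preads = new AtomicLong();

    PooledPositionalReader(Path file, int poolSize, long maxForwardSkip) throws IOException {
        // Every pooled stream references the same file; poolSize must be >= 1.
        for (int i = 0; i < poolSize; i++) {
            pool.add(new Pooled(FileChannel.open(file, StandardOpenOption.READ)));
        }
        this.maxForwardSkip = maxForwardSkip;
    }

    // The single API: a positional read (position plus a buffer whose remaining bytes give the length).
    int read(long position, ByteBuffer dst) throws IOException {
        for (Pooled p : pool) {
            if (!p.lock.tryLock()) continue;          // stream is busy: do not wait on it
            try {
                long gap = position - p.ch.position();
                if (gap >= 0 && gap <= maxForwardSkip) {
                    p.ch.position(position);          // seek ...
                    seekReads.incrementAndGet();
                    return p.ch.read(dst);            // ... then read, advancing this stream
                }
            } finally {
                p.lock.unlock();
            }
        }
        // No stream is close enough: positional read on an arbitrary stream.
        // FileChannel.read(dst, position) does not move the channel's own position.
        Pooled p = pool.get(ThreadLocalRandom.current().nextInt(pool.size()));
        int n = p.ch.read(dst, position);
        preads.incrementAndGet();
        long end = position + Math.max(n, 0);
        // Simplified detection: two back-to-back preads look sequential, so park a
        // stream at the predicted next offset for future reads to pick up.
        if (lastPreadEnd.getAndSet(end) == position && p.lock.tryLock()) {
            try { p.ch.position(end); } finally { p.lock.unlock(); }
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        for (Pooled p : pool) p.ch.close();
    }
}
```

With a pool of two streams and a 64-byte threshold, reads at offsets 0 and 10 take the seek+read path, reads at 500 and 510 fall back to positional reads, and the read at 520 takes the seek+read path again because a stream was parked there. A real version would need a better policy for which stream to reposition, since this sketch can take a stream away from another sequential reader.
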
> Non-pread DFSInputStreams should be associated with scanners, not HFile.Readers
> -------------------------------------------------------------------------------
>                 Key: HBASE-5979
>                 URL: https://issues.apache.org/jira/browse/HBASE-5979
>             Project: HBase
>          Issue Type: Improvement
>          Components: performance, regionserver
>            Reporter: Todd Lipcon
> Currently, every HFile.Reader has a single DFSInputStream, which it uses to service all
> gets and scans. For gets, we use the positional read API (aka "pread") and for scans we use
> a synchronized block to seek, then read. The advantage of pread is that it doesn't hold any
> locks, so multiple gets can proceed at the same time. The advantage of seek+read for scans
> is that the datanode starts to send the entire rest of the HDFS block, rather than just the
> single hfile block necessary. So, in a single thread, pread is faster for gets, and seek+read
> is faster for scans since you get a strong pipelining effect.
> However, in a multi-threaded case where there are multiple scans (including scans which
> are actually part of compactions), the seek+read strategy falls apart, since only one scanner
> may be reading at a time. Additionally, a large amount of wasted IO is generated on the datanode
> side, and we get none of the earlier-mentioned advantages.
> In one test, I switched scans to always use pread, and saw a 5x improvement in throughput
> of the YCSB scan-only workload, since it previously was completely blocked by contention on
> the DFSIS lock.
