hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-17910) Use separated StoreFileReader for streaming read
Date Thu, 13 Apr 2017 07:18:41 GMT
Duo Zhang created HBASE-17910:

             Summary: Use separated StoreFileReader for streaming read
                 Key: HBASE-17910
                 URL: https://issues.apache.org/jira/browse/HBASE-17910
             Project: HBase
          Issue Type: Bug
            Reporter: Duo Zhang

We already support private readers for compaction, by creating a new
StoreFile copy. I think a better way is to allow creating multiple readers from a single StoreFile
instance. That way we can avoid the ugly cloning, and the reader can also be used for streaming
scans, not only for compaction.

The reason we want to do this is that we found read amplification when using short-circuit
read. {{BlockReaderLocal}} reads data into an internal buffer first; the buffer size
is based on the configured buffer size and the readahead option in CachingStrategy. For a normal
pread request we should just bypass the buffer, which can be achieved by setting readahead
to 0. But for streaming read I think the buffer may still be useful? So we need to use
different FSDataInputStream instances for pread and streaming read.
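
As a rough illustration of the amplification described above, here is a toy model in Java. It is not HDFS or HBase code: the class {{ReadAmplificationSketch}}, the method {{fetchStats}}, and the 4 KB request / 64 KB buffer sizes are all assumptions made up for this sketch. It only counts how many bytes a reader would pull from the underlying file when every buffer miss refills the whole internal buffer, compared with a reader that bypasses the buffer.

```java
// Toy model only. All names and sizes here are hypothetical, not HDFS/HBase APIs.
class ReadAmplificationSketch {

    /**
     * Simulates reads of requestSize bytes at the given offsets.
     * bufferSize > 0 models an internal buffer that is refilled on every miss;
     * bufferSize == 0 models bypassing the buffer (readahead set to 0).
     * Returns {bytes fetched from the file, number of underlying fetches}.
     */
    static long[] fetchStats(long[] offsets, int requestSize, int bufferSize) {
        long bytes = 0;
        long fetches = 0;
        long bufStart = -1; // start offset of the buffered range
        long bufEnd = -1;   // exclusive end offset of the buffered range
        for (long off : offsets) {
            long end = off + requestSize;
            if (bufferSize > 0 && off >= bufStart && end <= bufEnd) {
                continue; // served from the buffer, nothing fetched
            }
            int fill = Math.max(bufferSize, requestSize);
            bytes += fill;
            fetches++;
            if (bufferSize > 0) {
                bufStart = off;
                bufEnd = off + fill;
            }
        }
        return new long[] {bytes, fetches};
    }

    public static void main(String[] args) {
        int request = 4 * 1024;  // assumed size of one read request
        int buffer = 64 * 1024;  // assumed internal buffer size

        // Random access (pread-like): every request lands outside the buffer.
        long[] random = {0L, 10_000_000L, 5_000_000L, 20_000_000L};
        // Sequential access (streaming-like): 16 back-to-back requests.
        long[] sequential = new long[16];
        for (int i = 0; i < sequential.length; i++) {
            sequential[i] = (long) i * request;
        }

        print("random, buffered", fetchStats(random, request, buffer));
        print("random, bypass", fetchStats(random, request, 0));
        print("sequential, buffered", fetchStats(sequential, request, buffer));
        print("sequential, bypass", fetchStats(sequential, request, 0));
    }

    private static void print(String label, long[] stats) {
        System.out.println(label + ": bytes=" + stats[0] + " fetches=" + stats[1]);
    }
}
```

In this model the four random 4 KB reads fetch 262144 bytes with the buffer and 16384 bytes without it, while the sixteen sequential reads fetch the same 65536 bytes either way but need 1 fetch with the buffer instead of 16. That is the asymmetry that motivates using different streams for pread and streaming read.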

One more thing: we can also remove the streamLock if streaming reads always use
their own reader.
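
To make the idea concrete, here is a minimal sketch of one file object handing out several readers. It is not the actual StoreFile or StoreFileReader API: {{SharedFileSketch}}, {{StreamReader}}, {{pread}} and {{openStreamReader}} are hypothetical names, and an in-memory byte array stands in for the file. The point it tries to show is that a positional read carries no cursor and can be shared, while each streaming reader owns its cursor, so nothing like a stream lock is needed as long as a streaming reader is never shared between threads.

```java
// Hypothetical names throughout; this is not the HBase StoreFile API.
class SharedFileSketch {
    private final byte[] data; // stands in for the file contents

    SharedFileSketch(byte[] data) {
        this.data = data;
    }

    /**
     * Positional read: the caller supplies the offset, so there is no shared
     * cursor and many threads can use this concurrently without a lock.
     * Returns the number of bytes copied, or -1 at or past end of file.
     */
    int pread(long position, byte[] dst) {
        if (position >= data.length) {
            return -1;
        }
        int n = (int) Math.min(dst.length, data.length - position);
        System.arraycopy(data, (int) position, dst, 0, n);
        return n;
    }

    /** Each call returns a new reader with its own private cursor. */
    StreamReader openStreamReader() {
        return new StreamReader();
    }

    /** Streaming reader owned by one scan or compaction; it is never shared. */
    final class StreamReader {
        private long pos = 0;

        int read(byte[] dst) {
            int n = pread(pos, dst);
            if (n > 0) {
                pos += n; // only this reader's cursor moves
            }
            return n;
        }

        long position() {
            return pos;
        }
    }

    public static void main(String[] args) {
        byte[] content = new byte[10];
        for (int i = 0; i < content.length; i++) {
            content[i] = (byte) i;
        }
        SharedFileSketch file = new SharedFileSketch(content);
        StreamReader scan = file.openStreamReader();
        StreamReader compaction = file.openStreamReader();
        scan.read(new byte[4]);
        compaction.read(new byte[2]);
        System.out.println("scan at " + scan.position()
            + ", compaction at " + compaction.position());
    }
}
```

In the real code each streaming reader would presumably wrap its own FSDataInputStream rather than a shared array; that part is left out here.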

This message was sent by Atlassian JIRA
