hbase-issues mailing list archives

From "Lars Hofhansl (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-4496) HFile V2 does not honor setCacheBlocks when scanning.
Date Tue, 27 Sep 2011 07:23:12 GMT
HFile V2 does not honor setCacheBlocks when scanning.

                 Key: HBASE-4496
                 URL: https://issues.apache.org/jira/browse/HBASE-4496
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.92.0, 0.94.0
            Reporter: Lars Hofhansl
             Fix For: 0.92.0, 0.94.0

While testing the LRU cache during scans, I noticed considerable churn in the cache even
when Scan.cacheBlocks is set to false. After debugging, I found that HFile V2 always
caches blocks in the LRU cache, regardless of the cacheBlocks setting.

Here's a trace (from Eclipse) showing the problem:

HFileReaderV2.readBlock(long, int, boolean, boolean, boolean) line: 279	
HFileReaderV2.readBlockData(long, long, int, boolean) line: 219	
HFileBlockIndex$BlockIndexReader.seekToDataBlock(byte[], int, int, HFileBlock) line: 191	
HFileReaderV2$ScannerV2.seekTo(byte[], int, int, boolean) line: 502	
HFileReaderV2$ScannerV2.reseekTo(byte[], int, int) line: 539	
StoreFileScanner.reseekAtOrAfter(HFileScanner, KeyValue) line: 151	
StoreFileScanner.reseek(KeyValue) line: 110	
KeyValueHeap.reseek(KeyValue) line: 255	
StoreScanner.reseek(KeyValue) line: 409	
StoreScanner.next(List<KeyValue>, int) line: 304	
KeyValueHeap.next(List<KeyValue>, int) line: 114	
KeyValueHeap.next(List<KeyValue>) line: 143	
HRegion$RegionScannerImpl.nextRow(byte[]) line: 2774	
HRegion$RegionScannerImpl.nextInternal(int) line: 2722	
HRegion$RegionScannerImpl.next(List<KeyValue>, int) line: 2682	
HRegion$RegionScannerImpl.next(List<KeyValue>) line: 2699	
HRegionServer.next(long, int) line: 2092	

Every scanner.next causes a reseek, which eventually calls HFileBlockIndex$BlockIndexReader.seekToDataBlock(...),
at which point the cacheBlocks information is lost. HFileReaderV2.readBlockData then calls HFileReaderV2.readBlock
with cacheBlocks unconditionally set to true.
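
The way the flag gets dropped can be illustrated with a simplified model. This is only a sketch, not the HBase source: the class LostCacheFlagSketch, the HashMap standing in for the LRU cache, and the reduced method signatures are all assumptions made for illustration. Only the method names mirror the trace above.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of the call chain in the trace; not the HBase source.
public class LostCacheFlagSketch {
    // Stand-in for the LRU block cache, keyed by block offset.
    static final Map<Long, byte[]> blockCache = new HashMap<>();

    // Models HFileReaderV2.readBlock: caches the block only when cacheBlock is true.
    static byte[] readBlock(long offset, boolean cacheBlock) {
        byte[] cached = blockCache.get(offset);
        if (cached != null) {
            return cached;
        }
        byte[] block = new byte[16]; // pretend this was read from disk
        if (cacheBlock) {
            blockCache.put(offset, block);
        }
        return block;
    }

    // Models HFileReaderV2.readBlockData: it has no caching parameter,
    // so it passes true unconditionally.
    static byte[] readBlockData(long offset) {
        return readBlock(offset, true);
    }

    // Models BlockIndexReader.seekToDataBlock: the scan's preference
    // is not part of this signature either.
    static byte[] seekToDataBlock(long offset) {
        return readBlockData(offset);
    }

    // Models the scanner seek. In this model the preference is a parameter;
    // it cannot be forwarded because the callee does not accept it.
    static byte[] seekTo(long offset, boolean cacheBlocks) {
        return seekToDataBlock(offset);
    }

    public static void main(String[] args) {
        seekTo(0L, false); // the caller asked for no caching
        System.out.println("blocks cached: " + blockCache.size());
    }
}
```

In this model a seek with cacheBlocks set to false still leaves one block in the cache, which matches the churn described above.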

The fix is not immediately clear. One option is to pass cacheBlocks to HFileBlockIndex$BlockIndexReader.seekToDataBlock
and then on to HFileBlock.BasicReader.readBlockData and all its implementers, but that is ugly,
as readBlockData should not care about caching.
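
For comparison, here is a minimal sketch of the pass-through approach in a simplified model. It is not the HBase source and not necessarily the fix that was eventually applied: the class ThreadedCacheFlagSketch, the HashMap standing in for the LRU cache, and the reduced signatures are hypothetical. It mainly shows the cost mentioned above, namely that every method between the scanner and readBlock gains a caching parameter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of threading the flag through the call chain; not the HBase source.
public class ThreadedCacheFlagSketch {
    // Stand-in for the LRU block cache, keyed by block offset.
    static final Map<Long, byte[]> blockCache = new HashMap<>();

    // Models readBlock: caches the block only when cacheBlock is true.
    static byte[] readBlock(long offset, boolean cacheBlock) {
        byte[] cached = blockCache.get(offset);
        if (cached != null) {
            return cached;
        }
        byte[] block = new byte[16]; // pretend this was read from disk
        if (cacheBlock) {
            blockCache.put(offset, block);
        }
        return block;
    }

    // readBlockData now carries a caching flag it otherwise has no use for.
    static byte[] readBlockData(long offset, boolean cacheBlock) {
        return readBlock(offset, cacheBlock);
    }

    // seekToDataBlock must accept and forward the flag as well.
    static byte[] seekToDataBlock(long offset, boolean cacheBlock) {
        return readBlockData(offset, cacheBlock);
    }

    // The scanner seek forwards the scan's preference instead of dropping it.
    static byte[] seekTo(long offset, boolean cacheBlocks) {
        return seekToDataBlock(offset, cacheBlocks);
    }

    public static void main(String[] args) {
        seekTo(0L, false);
        System.out.println("after non-caching seek: " + blockCache.size());
        seekTo(1L, true);
        System.out.println("after caching seek: " + blockCache.size());
    }
}
```

With the flag forwarded, the non-caching seek leaves the cache empty and only the caching seek adds an entry.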

Avoiding caching during scans is somewhat important for us.
