hbase-dev mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-10418) give blocks of smaller store files priority in cache
Date Sat, 25 Jan 2014 01:41:38 GMT
Sergey Shelukhin created HBASE-10418:

             Summary: give blocks of smaller store files priority in cache
                 Key: HBASE-10418
                 URL: https://issues.apache.org/jira/browse/HBASE-10418
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
            Reporter: Sergey Shelukhin

This is just an idea at this point; I don't have a patch, nor do I plan to make one in the near future.
It is especially useful for datasets that don't fit in memory, and when scans are involved.
Scans (and gets, when bloom filters cannot rule files out) have to read from all store files. A
short-range request will hit one block in every file.
If small files are more likely to be entirely available in memory, requests will, on average,
read fewer blocks from the FS.
For scans that read a lot of data, it's better to read the blocks of a big file in sequence from the FS
and the blocks of small files from cache, rather than a mix of FS and cached blocks from different
files, because the (HBase) blocks of a big file would be sequential within one HDFS block.
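
To make the "fewer blocks from FS" argument concrete, here is a rough back-of-the-envelope sketch in Python. It is not HBase code and not a proposed patch. It assumes a short-range request touches exactly one uniformly chosen block in every store file, and it compares two hypothetical cache states: one filled with the smallest files first, and one where every file has the same fraction of its blocks cached. The function names and the example file sizes are made up for illustration.

```python
def expected_fs_reads_small_first(file_blocks, cache_blocks):
    """Expected FS block reads per short-range request when the cache
    is filled with blocks of the smallest store files first.

    file_blocks: number of blocks in each store file (hypothetical sizes)
    cache_blocks: cache capacity, in blocks
    """
    remaining = cache_blocks
    misses = 0.0
    for n in sorted(file_blocks):
        cached = min(n, remaining)   # cache as much of this file as fits
        remaining -= cached
        misses += 1.0 - cached / n   # chance the one block we need is not cached
    return misses


def expected_fs_reads_uniform(file_blocks, cache_blocks):
    """Expected FS block reads per short-range request when every file
    has the same fraction of its blocks cached (no size priority)."""
    total = sum(file_blocks)
    cached_fraction = min(1.0, cache_blocks / total)
    return len(file_blocks) * (1.0 - cached_fraction)


if __name__ == "__main__":
    # One big file and three small ones; the cache holds 120 blocks.
    files = [1000, 100, 10, 10]
    print(expected_fs_reads_small_first(files, 120))
    print(expected_fs_reads_uniform(files, 120))
```

Under these assumptions, with the small files fully cached only the big file has to be read from the FS, while a size-agnostic cache leaves most of the per-file lookups going to the FS. Real block cache behavior depends on access patterns and eviction order, so this only illustrates the intuition.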
