hbase-user mailing list archives

From yun peng <pengyunm...@gmail.com>
Subject Could it improve read performance by storing HFile consecutive on disk?
Date Tue, 09 Jul 2013 15:49:27 GMT
In our use case, memory/cache is small, and we want to improve read/load
(from-disk) performance by storing HFile blocks consecutively on disk.
The idea is that if blocks are stored closer together on disk, reading a data
block from an HFile would require fewer random disk accesses.

In particular, looking up a value or reading a data block in an HFile requires
a B-tree-style root-to-leaf traversal, and each step of the traversal needs to
load a block from disk. Since the blocks along the root-to-leaf path are not
stored consecutively, those reads are typically random. I am wondering whether
we could store all the blocks on a root-to-leaf path in one consecutive disk
area, so that the random reads become sequential reads, which should be faster.
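
To make the question concrete, here is a rough sketch of the cost model I have
in mind. It is a toy model only, not HBase code: the function count_seeks and
all block offsets and sizes are made up for illustration. A read counts as a
seek when it does not start where the previous read ended.

```python
from typing import List, Optional, Tuple

# Hypothetical block descriptor: (offset, size) in bytes within one file.
Block = Tuple[int, int]


def count_seeks(path: List[Block]) -> int:
    """Count the reads on a root-to-leaf path that need a disk seek.

    A read needs a seek when its offset differs from the end of the
    previous read. The first read always counts as one seek.
    """
    seeks = 0
    expected_offset: Optional[int] = None
    for offset, size in path:
        if offset != expected_offset:
            seeks += 1
        expected_offset = offset + size
    return seeks


# Assumed layouts (illustrative numbers, not measured from a real HFile).
# Scattered: root index, intermediate index, and data block lie far apart.
scattered_path = [(9_000_000, 4096), (5_000_000, 65536), (0, 65536)]
# Contiguous: the same three blocks stored back to back.
contiguous_path = [(0, 4096), (4096, 65536), (4096 + 65536, 65536)]

print("scattered seeks:", count_seeks(scattered_path))
print("contiguous seeks:", count_seeks(contiguous_path))
```

Under this model the scattered path costs one seek per level, while the
contiguous path costs a single seek followed by sequential reads. It ignores
caching and whatever the file system does underneath, so it only illustrates
the idea and does not predict real performance.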

