hbase-user mailing list archives

From William Kang <weliam.cl...@gmail.com>
Subject HBase random access in HDFS and block indices
Date Tue, 19 Oct 2010 02:48:16 GMT
Recently I have spent some time trying to understand the internals of
HBase in order to find possible performance tuning options. Many
thanks to the folks in this community who helped with my questions; I
have sent a report. However, a few questions remain.

1. If an HFile block contains more than one KeyValue pair, does the
block index in the HFile record the offset of every KeyValue pair in
that block? Or does the block index only record the key range covered
by each block, so that you have to scan within the block until you
reach the key you are looking for?

2. When HBase reads a block to fetch data or to scan within it, is
that block read into memory?

3. HBase blocks (64 KB, configurable) are stored inside HDFS blocks
(64 MB, configurable), so reading an HBase block requires random
access within an HDFS block. Even if HBase can use in(p, buf, 0, x) to
read a small portion of the larger HDFS block, it is still a random
access. Would this be slow?
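
To make questions 1 and 3 concrete, here is a toy Python model of the
second alternative in question 1: a sparse index with one entry per
block, a positional read of just that block, and a linear scan inside
it. This is only a sketch of the idea being asked about, not HBase
code. The names write_blocks, get and BLOCK_SIZE are made up for
illustration, a local file stands in for an HDFS block, and os.pread
(POSIX only) stands in for the in(p, buf, 0, x) positional read.

```python
import bisect
import os
import struct
import tempfile

BLOCK_SIZE = 64  # toy value in bytes; stands in for the 64 KB HBase block


def write_blocks(path, pairs):
    """Write sorted (key, value) byte pairs into fixed-budget blocks.

    Returns a sparse index: one (first_key, offset, length) per block.
    """
    index = []
    with open(path, "wb") as f:
        block, first_key = b"", None
        for key, value in pairs:
            record = struct.pack(">HH", len(key), len(value)) + key + value
            # Flush the current block if this record would overflow it.
            if block and len(block) + len(record) > BLOCK_SIZE:
                index.append((first_key, f.tell(), len(block)))
                f.write(block)
                block, first_key = b"", None
            if first_key is None:
                first_key = key
            block += record
        if block:
            index.append((first_key, f.tell(), len(block)))
            f.write(block)
    return index


def get(path, index, key):
    """Look up key: binary search the index, read one block, scan it."""
    first_keys = [entry[0] for entry in index]
    i = bisect.bisect_right(first_keys, key) - 1
    if i < 0:
        return None  # key sorts before the first block
    _, offset, length = index[i]
    fd = os.open(path, os.O_RDONLY)
    try:
        # Positional read: only this block's bytes are brought into memory.
        block = os.pread(fd, length, offset)
    finally:
        os.close(fd)
    pos = 0
    while pos < len(block):
        klen, vlen = struct.unpack_from(">HH", block, pos)
        pos += 4
        k = block[pos:pos + klen]
        if k == key:
            return block[pos + klen:pos + klen + vlen]
        if k > key:
            return None  # keys are sorted, so the key is absent
        pos += klen + vlen
    return None


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    pairs = [(b"row%03d" % i, b"val%03d" % i) for i in range(20)]
    index = write_blocks(path, pairs)
    print(len(index), "index entries for", len(pairs), "pairs")
    print(get(path, index, b"row007"))
    os.remove(path)
```

In this model each lookup costs one binary search over the in-memory
index, one random read of a single block, and a short scan inside that
block. Whether HBase behaves the same way is exactly what the
questions above are asking.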

Many thanks. I would be grateful for your answers.

