hbase-user mailing list archives

From Mridul Muralidharan <mrid...@yahoo-inc.com>
Subject FW: Read op question
Date Fri, 08 Jan 2010 21:40:52 GMT
A colleague is unable to send this mail to the list, so I am proxying it.
Thanks in advance for the responses!



I'm trying to better understand the flow of the client read operation in 
HBase.  I've been looking at a combination of the HBase documents, Lars 
George's summary (very nice), the javadocs, and the BigTable paper.

My understanding from Lars and the documentation is that a given record 
maps to a single HRegion and a single Store on that HRegion.  Writes to 
that record are buffered in the MemStore.  When a MemStore is full, it 
is flushed to HDFS as an HFile.

My understanding is that if the record is updated multiple times, these 
updates may be stored in different HFiles.  The BigTable paper mentions 
this specifically, and I infer this from the HBase documentation too.
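
A toy Java sketch of the model described above may make the question concrete. Everything in it is an assumption for illustration: ToyStore, flushThreshold, and the in-memory "files" are hypothetical names and are not HBase classes or APIs. It only shows how successive updates to one row can end up in several flushed files, and what a naive read that consults every file would look like.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: ToyStore is a hypothetical class, not part of HBase.
public class ToyStore {
    private final int flushThreshold;
    // In-memory write buffer, sorted by row key (stands in for the MemStore).
    private TreeMap<String, String> memStore = new TreeMap<>();
    // Each flushed "file" is an immutable sorted map; the newest is last.
    private final List<TreeMap<String, String>> flushedFiles = new ArrayList<>();

    public ToyStore(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // Buffer the write; flush the buffer as a new file once it is full.
    public void put(String row, String value) {
        memStore.put(row, value);
        if (memStore.size() >= flushThreshold) {
            flushedFiles.add(memStore);
            memStore = new TreeMap<>();
        }
    }

    // How many flushed files hold some version of this row.
    public int countFilesContaining(String row) {
        int count = 0;
        for (TreeMap<String, String> file : flushedFiles) {
            if (file.containsKey(row)) {
                count++;
            }
        }
        return count;
    }

    // Naive read: check the buffer, then every file from newest to oldest.
    // This is the brute-force approach, not a claim about what HBase does.
    public String get(String row) {
        if (memStore.containsKey(row)) {
            return memStore.get(row);
        }
        for (int i = flushedFiles.size() - 1; i >= 0; i--) {
            String value = flushedFiles.get(i).get(row);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

With a threshold of 2, writing r1, r2, r1, r3, r1 leaves versions of r1 in two flushed files plus the buffer, so the naive get has to be prepared to look in all of them.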

So my question is: what indexing is present on an HRegion to support a 
read of a single record?  Aside from looking in the MemStore, how do you 
know which HFiles to read?  On opening an HFile, do you scan the whole thing?

Thanks for any details, including pointers to class names.