hbase-dev mailing list archives

From Tatsuya Kawano <tatsuya6...@gmail.com>
Subject Re: Items to contribute (plan)
Date Wed, 26 Jan 2011 11:42:47 GMT

Hi Stack, 


On Jan 25, 2011, Stack wrote:

>> 2. mapreduce.HFileInputFormat
>> 
>> MR library to read data directly from HFiles. (Roughly 2.5 times faster than TableInputFormat in my tests)
>> 
>> Current status: Completed a proof-of-concept prototype and measured performance.
>> 
> 
> What about the in-memory edits?  Or you thinking of reading the WALs too?

My prototype doesn't read in-memory edits, so you have to flush the table before running your
MR job.

To read in-memory edits, I would create a special scanner in the RS that reads KeyValues only
from the MemStore. I'll also add an observer to the RS to watch for region flush events.
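
As a rough illustration of what combining the two sources would involve, here is a toy sketch in plain Java. It is not HBase code: the Cell class and newestPerRow method are hypothetical names, and the model assumes a single value per row, whereas real KeyValues also carry column family, qualifier, and type. It only shows the idea that edits still in memory must override older data already flushed to HFiles.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class InMemoryMergeSketch {

    /** Hypothetical stand-in for a KeyValue: one value per row, with a timestamp. */
    public static final class Cell {
        public final String row;
        public final long timestamp;
        public final String value;

        public Cell(String row, long timestamp, String value) {
            this.row = row;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Combines cells read from flushed files with cells read from memory,
     * keeping the cell with the highest timestamp for each row.
     * On equal timestamps the in-memory cell wins because it is processed last.
     */
    public static Map<String, Cell> newestPerRow(List<Cell> fromHFiles, List<Cell> fromMemory) {
        Map<String, Cell> newest = new TreeMap<>(); // sorted by row key
        for (Cell cell : fromHFiles) {
            keepIfNewer(newest, cell);
        }
        for (Cell cell : fromMemory) {
            keepIfNewer(newest, cell);
        }
        return newest;
    }

    private static void keepIfNewer(Map<String, Cell> newest, Cell cell) {
        Cell seen = newest.get(cell.row);
        if (seen == null || cell.timestamp >= seen.timestamp) {
            newest.put(cell.row, cell);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data: row2 was rewritten after the last flush.
        List<Cell> flushed = Arrays.asList(
                new Cell("row1", 100L, "a"),
                new Cell("row2", 100L, "b"));
        List<Cell> unflushed = Arrays.asList(
                new Cell("row2", 200L, "b2"),
                new Cell("row3", 200L, "c"));

        for (Cell cell : newestPerRow(flushed, unflushed).values()) {
            System.out.println(cell.row + " -> " + cell.value);
        }
    }
}
```

With the sample data above, row2 resolves to the unflushed value and row3 appears only because the in-memory edits were read. A job that reads HFiles alone would miss both.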

Also, my prototype doesn't handle region compactions, so the MR job will fail if the compaction
threads delete old HFiles after a minor or major compaction. I need to find a solution for this
too.


- Tatsuya

--
Tatsuya Kawano (Mr.)
Tokyo, Japan
