hbase-dev mailing list archives

From Tatsuya Kawano <tatsuya6...@gmail.com>
Subject Re: Items to contribute (plan)
Date Wed, 26 Jan 2011 03:34:33 GMT

Hi Yifeng, 

> #4. Writing Japanese books and documents
> I am glad if I can work on this one with you.


Thanks for your offer. Let me explain a bit more about them. 


>> -- Currently I'm authoring a book chapter about HBase for a Japanese NOSQL book

This one is a commercial book from a Japanese publisher, so I'll do this by myself.


>> -- I'll translate The Apache HBase Book to Japanese


This one comes with HBase, and I'm looking for some people (like you) to work with.

http://hbase.apache.org/book.html

I created a Jira entry to track this task: 
https://issues.apache.org/jira/browse/HBASE-3391


Are you working at Rakuten in Tokyo? Maybe we can meet at the next Hadoop Source Code
Reading at Rakuten Tower. Do you know about this event?

Thanks, 
Tatsuya

--
Tatsuya Kawano (Mr.)
Tokyo, Japan


On Jan 25, 2011, at 11:03 AM, Yifeng Jiang <yifeng.jiang@mail.rakuten.co.jp> wrote:

> #4. Writing Japanese books and documents
> I am glad if I can work on this one with you.
> 
> 
> On 01/23/2011 10:18 AM, Tatsuya Kawano wrote:
>> Hi,
>> 
>> I wanted to let you know that I'm planning to contribute the following
>> items to the HBase community. These are my spare-time projects, and I can
>> only spend about 7 hours a week on them, so progress will be very slow.
>> I'd like some feedback from you to prioritize them. Also, if someone or a
>> team wants to work on them (with me or alone), I'll be happy to provide
>> more details.
>> 
>> 
>> 1. RADOS integration
>> 
>> Run HBase not only on HDFS but also on the RADOS distributed object store
>> (the lower layer of Ceph), so that the following options become available
>> to HBase users:
>> 
>> -- No SPOF (RADOS has no name nodes, only ZK-like monitors and data
>> nodes)
>> -- Instant backup of HBase tables (RADOS provides copy-on-write snapshots
>> per object pool)
>> -- Extra durability option for the WAL (RADOS can do both synchronous and
>> asynchronous disk flush. HDFS doesn't have the former option)
>> 
>> Note:
>> RADOS object = HFile, WAL
>> object pool = group of HFiles or WAL
>> 
>> Current status: Design phase
>> 
>> 
>> 2. mapreduce.HFileInputFormat
>> 
>> MR library to read data directly from HFiles. (Roughly 2.5 times faster
>> than TableInputFormat in my tests)
>> 
>> Current status: Completed a proof-of-concept prototype and measured performance.
>> 
>> 
>> 3. Enhance Get/Scan performance of RS
>> 
>> Add a hash code and a couple of flags to HFile at flush time and change
>> the scanner implementation so that:
>> 
>> -- Get/Scan operations will get faster. (Fewer key comparisons for
>> reconstructing a row: O(h * c) -> O(h). [h = number of HFiles for the row,
>> c = number of columns in an HFile])
>> -- The size of HFiles will become a bit smaller. (The flags will eliminate
>> duplicate bytes in keys (row, column family and qualifier) from HFiles.)
>> 
>> Current status: Completed a proof-of-concept prototype and measured performance.
>> 
>> Details:
>> https://github.com/tatsuya6502/hbase-mr-pof/
>> (I meant "poc" not "pof"...)
>> 
>> 
>> 4. Writing Japanese books and documents
>> 
>> -- Currently I'm authoring a book chapter about HBase for a Japanese NOSQL book
>> -- I'll translate The Apache HBase Book to Japanese
>> 
>> 
>> Thank you,
>> 
>> 
>> --
>> Tatsuya Kawano (Mr.)
>> Tokyo, Japan
>> 
>> http://twitter.com/#!/tatsuya6502
>> 
>> 
>> 
> 
> 
> -- 
> Yifeng Jiang
> 
