hbase-dev mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: New hbase file format (HBASE-61)
Date Sat, 07 Feb 2009 22:19:40 GMT
One of the important features of rfile is the eschewing of streaming, and
the realization that given HBase's memcache, every key and value must fit in
ram at least once.  So by stripping down complexity, and going with a
block-oriented read, it also makes reliable and massive block caching an easy
reality.  With an unbounded soft-ref-style block cache, hbase can use as much
ram as you throw at it via the -Xmx parameter to cache blocks and improve
performance.
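
To make the soft-ref block cache idea concrete, here is a rough sketch of what such a cache could look like. This is not the rfile code: the class name SoftBlockCache, the file-name-plus-offset key, and the method names are all made up for illustration. It only shows the general technique of holding blocks through SoftReference so that the JVM may reclaim them under memory pressure, which is why the cache can grow toward whatever -Xmx allows.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of an unbounded soft-reference block cache; not HBase's code. */
final class SoftBlockCache {
    /** Soft reference that remembers its key so a cleared entry can be dropped from the map. */
    private static final class BlockRef extends SoftReference<byte[]> {
        final String key;

        BlockRef(String key, byte[] block, ReferenceQueue<byte[]> queue) {
            super(block, queue);
            this.key = key;
        }
    }

    private final ConcurrentHashMap<String, BlockRef> blocks = new ConcurrentHashMap<>();
    private final ReferenceQueue<byte[]> cleared = new ReferenceQueue<>();

    /** Assumed key scheme: one block is identified by its file name and byte offset. */
    private static String key(String file, long offset) {
        return file + "@" + offset;
    }

    /** Caches a whole block; there is no size limit, the garbage collector does the evicting. */
    void put(String file, long offset, byte[] block) {
        purge();
        String k = key(file, offset);
        blocks.put(k, new BlockRef(k, block, cleared));
    }

    /** Returns the cached block, or null if it was never cached or has been reclaimed. */
    byte[] get(String file, long offset) {
        purge();
        BlockRef ref = blocks.get(key(file, offset));
        return ref == null ? null : ref.get();
    }

    /** Removes map entries whose blocks the garbage collector has already cleared. */
    private void purge() {
        BlockRef ref;
        while ((ref = (BlockRef) cleared.poll()) != null) {
            blocks.remove(ref.key, ref);
        }
    }
}
```

A reader would call get() first and, on a null result, read the block from the file and put() it. Because every entry is a complete block, a miss costs exactly one block read and there is no partial or streaming state to track.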

Stack has posted those performance numbers that validate that a simpler
approach = faster.  There are many tuning parameters (block size vs expected
key size being one) that will affect and improve performance, and other
features that are addable.

My goal has been to improve HBase's end-user read performance by 50-100x.
I'm hoping that with rfile this becomes a reality.


On Fri, Feb 6, 2009 at 5:37 PM, stack <stack@duboce.net> wrote:

> Ryan Rawson and I have been running performance experiments where we swap
> out MapFile and put in its place a customized HADOOP-3315 tfile and a format
> that Ryan wrote himself named rfile.  It's looking like Ryan's rfile has many
> benefits over our current MapFile-based format and that we'll likely move on
> to it over the next week or so.  If you're interested, see toward the end of
> HBASE-61, the new file format discussion doc
> http://wiki.apache.org/hadoop/Hbase/NewFileFormat, and the coarse
> performance stats here:
> http://wiki.apache.org/hadoop/Hbase/NewFileFormat/Performance.
> Comments welcome either here or up in the issue,
> Thanks,
> St.Ack
