hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-61) [hbase] Create an HBase-specific MapFile implementation
Date Tue, 24 Feb 2009 00:38:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676145#action_12676145
] 

stack commented on HBASE-61:
----------------------------

I'm +1 on a commit (all tests pass for me).  There is still integration work to do -- in particular
mapping the HColumnDescriptor configurations onto the new hfile for bloomfilters, compression,
and block sizing -- but I'd suggest we do these as separate issues; the patch is big enough
already.

A rough performance evaluation shows random reads up by about 60% and writes up by about 25%,
but scans are down.  I will do some profiling over the next few days.

Other notes on the patch:

+ The change to hbase-site.xml is not yet hooked up.
+ This patch breaks binary keys because it undoes the ugly stuff we did to make them work.
 Will fix again when we address HBASE-859 -- that's next.  In other words, this patch has already
started the reworking of HStoreKey, removing all the crap where every key had an HRegionInfo
reference.  One thing in particular that it adds is a raw comparator for comparing store keys
(that is, no object instantiation, just a pure byte compare).
+ The patch is basically a rewrite from HStore down.  A few files were renamed because they
changed so much -- HStore becomes Store, HStoreFile becomes StoreFile, etc.
+ Some pieces of this patch are taken from tfile (HADOOP-3315), in particular the hfile tests
and much of the compression facility, e.g. BoundedRangeFileInputStream and the Compression types.
+ A few files are missing the Apache license header (e.g. the simple block cache) -- we can add it
when we commit.
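
To illustrate the raw-comparator point in the notes above, here is a minimal sketch of comparing two keys as byte ranges without instantiating key objects. It is not code from the patch: the class name RawKeyCompareSketch and the method compareRaw are hypothetical, and it assumes keys sort as unsigned bytes in lexicographic order, with a shorter key sorting before a longer key that shares its prefix.

```java
import java.nio.charset.StandardCharsets;

public class RawKeyCompareSketch {
    // Hypothetical helper (not the HBase API): compares two key slices
    // byte by byte, treating each byte as unsigned. No key objects are
    // created; only offsets and lengths into existing buffers are used.
    static int compareRaw(byte[] a, int aOff, int aLen,
                          byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter slice sorts first.
        return aLen - bLen;
    }

    public static void main(String[] args) {
        // Two keys stored back to back in one buffer, as they might be in a block.
        byte[] buf = "row-arow-b".getBytes(StandardCharsets.US_ASCII);
        int cmp = compareRaw(buf, 0, 5, buf, 5, 5);
        System.out.println(cmp < 0 ? "first key sorts first" : "second key sorts first or equal");
    }
}
```

Because the comparison works directly on offsets into a shared buffer, it can run against keys read from a file block without deserializing them first.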

> [hbase] Create an HBase-specific MapFile implementation
> -------------------------------------------------------
>
>                 Key: HBASE-61
>                 URL: https://issues.apache.org/jira/browse/HBASE-61
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: io
>            Reporter: Bryan Duxbury
>            Assignee: ryan rawson
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: cpucalltreetfile.html, HBASE-83.patch, hfile.patch, hfile2.patch,
hfile3.patch, longestkey.patch, tfile.patch, tfile3.patch
>
>
> Today, HBase uses the Hadoop MapFile class to store data persistently to disk. This is
convenient, as it's already done (and maintained by other people :). However, it's beginning
to look like there might be possible performance benefits to be had from doing an HBase-specific
implementation of MapFile that incorporated some precise features.
> This issue should serve as a place to track discussion about what features might be included
in such an implementation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

