hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-61) [hbase] Create an HBase-specific MapFile implementation
Date Fri, 06 Feb 2009 08:51:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671060#action_12671060 ]

stack commented on HBASE-61:

Ryan checked in his rfile over here on github: http://github.com/ryanobjc/hbase-rfile/tree/master

It's up on github so more than one person can bang on it.  Notion is first to test rfile vs
tfile vs mapfile (I checked the latest hfile into github for contrast) and then, whichever wins,
make a patch out of the github code for this issue.

I added to github an evaluation of RFile using PE.  It looks like RFile is ahead of MF using an
8k buffer and 10-byte cells.  Tomorrow I will do more work ensuring all files are returning what
they are supposed to and will try to compare on dfs.

Also talked to AJ today.  Suggested playing with pread -- DFSDataIS has one -- so the file can
be more 'live'.  Suggested also removing buffering on DFSDIS since we're reading in blocks,
and suggested we also look at the receive socket buffer size -- maybe add our own socket factory
and, if block size < socket receive buffer size, use the smaller.
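
A rough sketch of the two ideas above, for illustration only.  It is not taken from the rfile
or hfile code: the class and method names are hypothetical, and a local java.nio FileChannel
stands in for the DFS input stream so the example runs without Hadoop.  The first method does
a pread-style block read (explicit offset, stream position untouched); the other two pick the
smaller of block size and socket receive buffer size and apply it to a socket.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper class; names are assumptions made for this sketch.
final class BlockReadSketch {

    // pread-style read: fills one block starting at 'offset' without moving
    // the channel's own position, so several readers can share one open file.
    static byte[] readBlock(FileChannel ch, long offset, int blockSize) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(blockSize);
        long pos = offset;
        while (buf.hasRemaining()) {
            int n = ch.read(buf, pos); // positional read, explicit file offset
            if (n < 0) {
                throw new EOFException("EOF before a full block at offset " + offset);
            }
            pos += n;
        }
        return buf.array();
    }

    // "If block size < socket receive buffer size, use the smaller."
    static int chooseReceiveBuffer(int blockSize, int socketReceiveBuffer) {
        return Math.min(blockSize, socketReceiveBuffer);
    }

    // Applies the choice to a socket. The value is only a hint to the OS,
    // which may round or clamp it, and it is best set before connecting.
    static void tuneReceiveBuffer(Socket s, int blockSize) throws IOException {
        s.setReceiveBufferSize(chooseReceiveBuffer(blockSize, s.getReceiveBufferSize()));
    }
}
```

A custom socket factory would call something like tuneReceiveBuffer on each socket it creates,
before connecting.  On the Hadoop side, FSDataInputStream offers the equivalent positional
read(position, buffer, offset, length).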

> [hbase] Create an HBase-specific MapFile implementation
> -------------------------------------------------------
>                 Key: HBASE-61
>                 URL: https://issues.apache.org/jira/browse/HBASE-61
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: io
>            Reporter: Bryan Duxbury
>            Assignee: stack
>            Priority: Minor
>             Fix For: 0.20.0
>         Attachments: cpucalltreetfile.html, hfile.patch, hfile2.patch, hfile3.patch,
>                      longestkey.patch, tfile.patch, tfile3.patch
>
> Today, HBase uses the Hadoop MapFile class to store data persistently to disk. This is
> convenient, as it's already done (and maintained by other people :). However, it's beginning
> to look like there might be possible performance benefits to be had from doing an HBase-specific
> implementation of MapFile that incorporated some precise features.
> This issue should serve as a place to track discussion about what features might be included
> in such an implementation.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
