hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-61) [hbase] Create an HBase-specific MapFile implementation
Date Thu, 05 Feb 2009 16:39:59 GMT

     [ https://issues.apache.org/jira/browse/HBASE-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-61:

    Attachment: hfile3.patch

Latest version of the hfile patch.  Scanners work properly now, and the API has been stripped
down.  We do actually need the SimpleBufferedInputStream between tfile and DFSInputStream, just
with a smaller buffer size, for the sake of increased concurrency.  We also need to change how
we read so that we pull in the whole block rather than reading it piecemeal as tfile currently
does.  tfile is block-based, but its reads on the backing stream do not pull in whole blocks;
it reads only what's needed.  Because it reads and decompresses only what it needs, it can be
faster in certain circumstances, but there is then never a whole block to cache, which
frustrates caching on a block basis and, more importantly, caching decompressed blocks.
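
To make the whole-block idea concrete, here is a minimal sketch of reading and decompressing
an entire block once and then serving later reads from a cache of decompressed blocks.  It is
not taken from the hfile or tfile patches: the class BlockCacheSketch, its LRU policy, the use
of gzip, and the in-memory byte arrays standing in for the file on DFS are all assumptions
made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch: decompress whole blocks once and cache the result. */
class BlockCacheSketch {
    /** Small LRU map of decompressed blocks, keyed by block index. */
    static final class LruBlockCache extends LinkedHashMap<Integer, byte[]> {
        private final int maxBlocks;

        LruBlockCache(int maxBlocks) {
            super(16, 0.75f, true); // true = iterate in access order (LRU)
            this.maxBlocks = maxBlocks;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
            return size() > maxBlocks;
        }
    }

    private final byte[][] compressedBlocks; // stands in for the blocks on disk
    private final LruBlockCache cache;
    int decompressions = 0;                  // counts cache misses

    BlockCacheSketch(byte[][] compressedBlocks, int maxCachedBlocks) {
        this.compressedBlocks = compressedBlocks;
        this.cache = new LruBlockCache(maxCachedBlocks);
    }

    /** Gzip-compresses one block; used only to build sample input. */
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decompresses an entire block in one pass. */
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Returns the decompressed block, decompressing it only on a cache miss. */
    byte[] readBlock(int index) {
        byte[] block = cache.get(index);
        if (block == null) {
            block = decompress(compressedBlocks[index]);
            decompressions++;
            cache.put(index, block);
        }
        return block;
    }

    /** Reads one byte at the given offset within a block. */
    byte read(int blockIndex, int offset) {
        return readBlock(blockIndex)[offset];
    }

    public static void main(String[] args) {
        byte[][] blocks = { compress("aaaa".getBytes()), compress("bbbb".getBytes()) };
        BlockCacheSketch reader = new BlockCacheSketch(blocks, 1);
        reader.read(0, 0);
        reader.read(0, 3); // served from the cached block
        System.out.println("decompressions: " + reader.decompressions);
    }
}
```

The trade-off is the one described above: the first read of a block pays to decompress all of
it, even if only a few bytes are wanted, but every later read of that block skips both the
backing-stream read and the decompression.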

I'd work on this next, but I have been chatting with Ryan Rawson over the last few days and he
just sent me his rfile patch.  Going to help out on that effort for a while.

> [hbase] Create an HBase-specific MapFile implementation
> -------------------------------------------------------
>                 Key: HBASE-61
>                 URL: https://issues.apache.org/jira/browse/HBASE-61
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: io
>            Reporter: Bryan Duxbury
>            Assignee: stack
>            Priority: Minor
>             Fix For: 0.20.0
>         Attachments: cpucalltreetfile.html, hfile.patch, hfile2.patch, hfile3.patch,
> longestkey.patch, tfile.patch, tfile3.patch
> Today, HBase uses the Hadoop MapFile class to store data persistently to disk. This is
> convenient, as it's already done (and maintained by other people :). However, it's beginning
> to look like there might be possible performance benefits to be had from doing an HBase-specific
> implementation of MapFile that incorporated some precise features.
> This issue should serve as a place to track discussion about what features might be included
> in such an implementation.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
