hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6799) Store more metadata in HFiles
Date Mon, 17 Sep 2012 23:23:07 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457465#comment-13457465 ]

Lars Hofhansl commented on HBASE-6799:

Hmm... That'd be good enough, methinks. I had just looked at the FileInfo code that writes
the metadata at the end of a write.
As long as all this info is available quickly without standing up an HBase instance that's

Looks like I need to do a bit more research :)
> Store more metadata in HFiles
> -----------------------------
>                 Key: HBASE-6799
>                 URL: https://issues.apache.org/jira/browse/HBASE-6799
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
> Currently we store the following metadata in each HFile:
> * the timerange of KVs
> * the earliest PUT ts
> * max sequence id
> * whether or not this file was created from a major compaction.
> I would like to brainstorm what extra data we need to store to make an HFile self-describing.
> I.e. it could be backed up somewhere, and external tools (without invoking an HBase server)
> could glean enough information from it to make use of the data inside. Ideally it would also
> be nice to be able to recreate .META. from a bunch of HFiles in order to stand up a temporary
> HBase instance to process those HFiles.
> What I can think of:
> * min/max key
> * table
> * column family (or families to be future proof)
> * custom tags (set by a backup tool, for example)
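
A rough sketch of how the fields listed above could travel with a file as a flat key/value map that an external tool reads back without any server running. Everything here is an assumption made for illustration: the key names (TABLE, FAMILIES, MIN_KEY, MAX_KEY, the TAG. prefix), the length-prefixed encoding, and the helpers describe_hfile, serialize and deserialize are hypothetical. They are not HBase's actual FileInfo keys, on-disk format, or API.

```python
# Hypothetical sketch: a self-describing metadata map for a store file.
# Key names and encoding are illustrative only, not HBase's real FileInfo.
import struct


def describe_hfile(table, families, min_key, max_key, tags=None):
    """Build a bytes -> bytes map holding the proposed extra metadata."""
    info = {
        b"TABLE": table.encode("utf-8"),
        # Comma-separated so that more than one family can be recorded.
        b"FAMILIES": ",".join(families).encode("utf-8"),
        b"MIN_KEY": min_key,
        b"MAX_KEY": max_key,
    }
    # Custom tags (e.g. set by a backup tool) share a common prefix.
    for name, value in (tags or {}).items():
        info[b"TAG." + name.encode("utf-8")] = value.encode("utf-8")
    return info


def serialize(info):
    """Length-prefix every key and value so the map can be appended to a file."""
    out = [struct.pack(">I", len(info))]
    for key in sorted(info):
        value = info[key]
        out.append(struct.pack(">I", len(key)) + key)
        out.append(struct.pack(">I", len(value)) + value)
    return b"".join(out)


def deserialize(data):
    """Inverse of serialize(): recover the map from raw bytes alone."""
    (count,) = struct.unpack_from(">I", data, 0)
    offset = 4
    info = {}
    for _ in range(count):
        fields = []
        for _ in range(2):  # first the key, then the value
            (length,) = struct.unpack_from(">I", data, offset)
            offset += 4
            fields.append(data[offset:offset + length])
            offset += length
        info[fields[0]] = fields[1]
    return info
```

A backup tool could call describe_hfile(...) when copying a file and store the serialize(...) output next to it. Recreating .META. would then amount to scanning those maps for table name and min/max key, again assuming a layout like the one sketched here.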

