hbase-issues mailing list archives

From "stack (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
Date Fri, 30 Mar 2012 05:29:22 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242076#comment-13242076 ]

stack commented on HBASE-5604:

I like the name WALPlayer.

How is this related to Import at all?  (Import imports WAL files?)

Does Import write HFiles?

Does the HDFS mod time get updated when you close the files?  If so, that might work while
the file is in position but won't work if the file gets moved to .oldlogs or to the archive?
Should we add a fixed-size tail of metadata to each WAL so you know where to start reading to
pick it up?  I suppose you can't rely on all WALs being present?  If they were, you could use
the start date of the next WAL (after sorting them by date) as the ending date of the
current file.
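
To make the last idea concrete, here is a minimal sketch of inferring each WAL's end time from the start time of the next one. It assumes every WAL is present and that a start time is known for each file; the `Wal` and `Interval` classes and both method names are hypothetical, not HBase APIs. The newest WAL has no successor, so it is treated as open-ended.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class WalIntervals {
    /** Hypothetical descriptor: a WAL file name plus the time of its first edit. */
    static final class Wal {
        final String name;
        final long startMillis;
        Wal(String name, long startMillis) {
            this.name = name;
            this.startMillis = startMillis;
        }
    }

    /** Half-open time span [startMillis, endMillis) covered by one WAL. */
    static final class Interval {
        final String name;
        final long startMillis;
        final long endMillis;
        Interval(String name, long startMillis, long endMillis) {
            this.name = name;
            this.startMillis = startMillis;
            this.endMillis = endMillis;
        }
    }

    /** Sorts WALs by start time and uses the next WAL's start as the current WAL's end. */
    static List<Interval> inferIntervals(List<Wal> wals) {
        List<Wal> sorted = new ArrayList<>(wals);
        sorted.sort(Comparator.comparingLong((Wal w) -> w.startMillis));
        List<Interval> intervals = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            Wal current = sorted.get(i);
            // Assumption: no WAL is missing. A gap would silently stretch this interval.
            long end = (i + 1 < sorted.size())
                    ? sorted.get(i + 1).startMillis
                    : Long.MAX_VALUE; // newest WAL: end time unknown
            intervals.add(new Interval(current.name, current.startMillis, end));
        }
        return intervals;
    }

    /** Returns the names of WALs whose interval overlaps the requested range [from, to). */
    static List<String> overlapping(List<Interval> intervals, long from, long to) {
        List<String> names = new ArrayList<>();
        for (Interval iv : intervals) {
            if (iv.startMillis < to && from < iv.endMillis) {
                names.add(iv.name);
            }
        }
        return names;
    }
}
```

If a WAL is missing, the preceding file's inferred interval becomes too wide, so this can only over-select files, never skip one that holds edits in range.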

Should we rename WALs on close so they have the start and end time as their name?

> HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
> ------------------------------------------------------------------------
>                 Key: HBASE-5604
>                 URL: https://issues.apache.org/jira/browse/HBASE-5604
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based
> on time before the M/R job is started and then have a mapper per HLog file.
> The mapper would then go through the HLog, filter out all WALEdits that don't fit into the
> time range or are not for any of the tables, and then use HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.
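
The per-mapper filtering step in the quoted description could look roughly like the sketch below. It deliberately uses no HBase classes: `Edit` is a hypothetical stand-in for a WAL entry, reduced to a table name and a timestamp, and the half-open range [minTs, maxTs) is an assumption rather than something the issue specifies. Writing the surviving edits out as HFiles is not shown.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WalEditFilter {
    /** Hypothetical stand-in for one WAL entry: the table it belongs to and its timestamp. */
    static final class Edit {
        final String table;
        final long timestamp;
        Edit(String table, long timestamp) {
            this.table = table;
            this.timestamp = timestamp;
        }
    }

    private final Set<String> tables;
    private final long minTs; // inclusive
    private final long maxTs; // exclusive (assumed convention)

    WalEditFilter(Set<String> tables, long minTs, long maxTs) {
        this.tables = new HashSet<>(tables);
        this.minTs = minTs;
        this.maxTs = maxTs;
    }

    /** Keeps an edit only if it is for a requested table and falls inside the time range. */
    boolean accept(Edit e) {
        return tables.contains(e.table) && e.timestamp >= minTs && e.timestamp < maxTs;
    }

    /** Applies accept() to every edit read from one log file, preserving order. */
    List<Edit> filter(List<Edit> edits) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : edits) {
            if (accept(e)) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```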


