hbase-issues mailing list archives

From "Lars Hofhansl (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
Date Wed, 21 Mar 2012 16:51:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234485#comment-13234485 ]

Lars Hofhansl commented on HBASE-5604:
--------------------------------------

This is a part of HBase I am not that familiar with.
Why is this in principle different from ImportTsv?
I guess it is because each mapper can encounter WALEdits for many tables in the HLog file(s)
that it works on...?
In the end, though, it would be a reducer writing the HFiles, so the distribution of HLogs to
mappers should not matter, I think.

Hmm... Maybe this is only useful when we have a *lot* of logs to replay, such as in a
point-in-time recovery scenario using HLogs.
Or maybe there would be no advantage in turning this into an M/R job, and it should
just be a standalone client...?
                
> HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
> ------------------------------------------------------------------------
>
>                 Key: HBASE-5604
>                 URL: https://issues.apache.org/jira/browse/HBASE-5604
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
>
> Just an idea I had. Might be useful for restoring a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a time range and a (set of) table(s). We'd pick the right HLogs based
> on time before the M/R job is started and then have a mapper per HLog file.
> The mapper would then go through the HLog, filter out all WALEdits that don't fit into the
> time range or don't belong to any of the tables, and then use HFileOutputFormat to generate HFiles.
> We would need to indicate the splits we want, probably from a live table.
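
The filtering step in the description above could be sketched roughly as follows. This is only an illustration in plain Java, not HBase code: `LogEntry` is a hypothetical stand-in for an HLog entry, `filter` is an invented helper name, and the half-open time range [minTime, maxTime) is an assumption of this sketch. A real job would do the same check inside a mapper and hand the surviving edits to HFileOutputFormat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WalEditFilterSketch {

    /** Hypothetical stand-in for one HLog entry; not an HBase class. */
    public static final class LogEntry {
        public final String table;
        public final long writeTime;
        public final String payload;

        public LogEntry(String table, long writeTime, String payload) {
            this.table = table;
            this.writeTime = writeTime;
            this.payload = payload;
        }
    }

    /**
     * Keeps only entries that belong to one of the requested tables and whose
     * write time falls in [minTime, maxTime). Everything else is dropped.
     */
    public static List<LogEntry> filter(List<LogEntry> entries, Set<String> tables,
                                        long minTime, long maxTime) {
        List<LogEntry> kept = new ArrayList<>();
        for (LogEntry e : entries) {
            boolean inRange = e.writeTime >= minTime && e.writeTime < maxTime;
            if (inRange && tables.contains(e.table)) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<LogEntry> log = Arrays.asList(
            new LogEntry("t1", 100L, "put a"),
            new LogEntry("t2", 150L, "put b"),   // wrong table
            new LogEntry("t1", 200L, "put c"),
            new LogEntry("t1", 300L, "put d"));  // outside the time range
        Set<String> tables = new HashSet<>(Arrays.asList("t1"));

        for (LogEntry e : filter(log, tables, 100L, 300L)) {
            System.out.println(e.table + " @" + e.writeTime + " " + e.payload);
        }
    }
}
```

Because the check is per entry and stateless, it does not matter which mapper sees which HLog file, which fits the point made in the comment above about the distribution of HLogs to mappers.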

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
