hbase-issues mailing list archives

From "Vladimir Rodionov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-14141) HBase Backup/Restore Phase 3: Filter WALs on backup to include only edits from backed up tables
Date Mon, 13 Nov 2017 18:22:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16249932#comment-16249932 ]

Vladimir Rodionov edited comment on HBASE-14141 at 11/13/17 6:21 PM:
---------------------------------------------------------------------

{quote}
What happens if conversion to HFiles fails midway? I don't see cleanup (perhaps it is there
in failBackup, but we don't seem to pass the tmp dir name). I see that incrementalCopyHFiles
does cleanup, but I don't see it in the conversion of WAL to HFile.
{quote}
If the M/R job fails (returns non-zero), we throw an exception and fail the operation (failBackup).
{quote}
* * Get list of WAL files eligible for incremental backup
What makes a WAL eligible for backup?
{quote}
All WAL files that have not yet been processed by the backup system are considered eligible for
incremental backup.
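
As a rough illustration of that rule (a sketch only, not the actual backup-system code): eligibility amounts to a set difference between the WAL files currently known and those the backup system table already records as processed. The function name and the file names below are hypothetical.

```python
def eligible_wals(all_wals, processed_wals):
    """Return WAL files not yet processed by the backup system, in sorted order."""
    return sorted(set(all_wals) - set(processed_wals))


# Hypothetical WAL names, for illustration only.
all_wals = ["wal.1", "wal.2", "wal.3"]
processed = ["wal.1"]
print(eligible_wals(all_wals, processed))
```
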
{quote}
getLogFilesFromBackupSystem gets log files from backup table. Will the set be large? Will it
grow w/o bound?
{quote}

Yes, it can be large. There is no bound other than the TTL of the backup system table, which
is 1 year by default but configurable. This should be mentioned explicitly in the
documentation, probably in a separate mini-section.

 





> HBase Backup/Restore Phase 3: Filter WALs on backup to include only edits from backed up tables
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14141
>                 URL: https://issues.apache.org/jira/browse/HBASE-14141
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Blocker
>              Labels: backup
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14141.HBASE-14123.v1.patch, HBASE-14141.v1.patch, HBASE-14141.v2.patch, HBASE-14141.v4.patch, HBASE-14141.v5.patch, HBASE-14141.v6.patch
>
>
> h2. High level design overview
> * When an incremental backup request comes in for tables {t}, we select all the tables already registered in the backup system, {T}, and union them with {t}, which results in a new table set U(t, T)
> * For every table K from U(t, T) we perform the following:
> ** Convert new WAL files into HFiles, applying table filter K (only edits for table K will pass the filter)
> ** Move these HFile(s) to the backup destination
> During restore (incremental):
> * We run a full restore first
> * Then we collect all HFiles from the intermediate incremental images and run them through HFileSplitterJob, which splits the files along the current table's region boundaries
> * Load these files using the LoadIncrementalHFiles tool
>   
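
The backup half of the overview above can be sketched as follows. This is a minimal illustration in Python, not the HBase implementation: WAL edits are modelled as (table, row, value) tuples, and `tables_to_back_up` and `filter_edits` are hypothetical names.

```python
def tables_to_back_up(requested, registered):
    """U(t, T): union of the requested tables {t} and the registered tables {T}."""
    return set(requested) | set(registered)


def filter_edits(wal_edits, table):
    """Keep only the WAL edits that belong to the given table."""
    return [edit for edit in wal_edits if edit[0] == table]


# Hypothetical data: each WAL edit is a (table, row, value) tuple.
wal_edits = [("t1", "r1", "a"), ("t2", "r1", "b"), ("t3", "r9", "c"), ("t1", "r2", "d")]

# Incremental backup requested for t1; t2 is already registered in the backup system.
per_table = {
    table: filter_edits(wal_edits, table)
    for table in sorted(tables_to_back_up({"t1"}, {"t2"}))
}
# per_table maps each table in U(t, T) to the edits that would go into its HFile(s);
# edits for t3, which is not in U(t, T), are dropped.
```

In this sketch each table's filtered edits stand in for the HFile(s) that would then be moved to the backup destination.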



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
