hbase-issues mailing list archives

From "Vladimir Rodionov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-14142) HBase Backup/Restore Phase 3: Edits deduplication during backup
Date Tue, 07 Jun 2016 01:27:21 GMT

     [ https://issues.apache.org/jira/browse/HBASE-14142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Rodionov updated HBASE-14142:
--------------------------------------
    Summary: HBase Backup/Restore Phase 3: Edits deduplication during backup  (was: HBase
Backup/Restore Phase 3: Cells deduplication during backup)

> HBase Backup/Restore Phase 3: Edits deduplication during backup
> ---------------------------------------------------------------
>
>                 Key: HBASE-14142
>                 URL: https://issues.apache.org/jira/browse/HBASE-14142
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>
> Because we do not record the last backed-up sequence ids (MVCC) and do not restore up to
> that sequence id (doing so is somewhat tricky), there will be some duplicate KVs in store files
> after the first incremental restore that follows a full backup. These duplicates result from how we
> perform the full backup and the first incremental backup after it. During a full backup we perform a
> distributed log roll and record, for every RS, the last WAL timestamp; then we take a snapshot. The
> next WAL after the recorded one will make it into the next incremental backup set, but it will contain
> some edits (puts, deletes) that were already captured by the preceding snapshot. During restore we
> first restore the snapshot and then replay the WALs, and this replay can create duplicate KVs in
> different store files.
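
The issue text does not say how the deduplication would be done, so the following is only a minimal sketch of one possible approach, not the HBase implementation. It assumes an edit is identified by its coordinates (row, family, qualifier, timestamp, type) and drops any WAL edit whose coordinates are already present in the snapshot. The names `EditDedupSketch`, `Edit`, and `dedupe` are hypothetical and are not HBase APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class EditDedupSketch {

    /** Hypothetical stand-in for a KV: coordinates only, no value payload. */
    static final class Edit {
        final String row;
        final String family;
        final String qualifier;
        final long timestamp;
        final String type; // e.g. "Put" or "Delete"

        Edit(String row, String family, String qualifier, long timestamp, String type) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.type = type;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Edit)) return false;
            Edit e = (Edit) o;
            return timestamp == e.timestamp
                && row.equals(e.row)
                && family.equals(e.family)
                && qualifier.equals(e.qualifier)
                && type.equals(e.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(row, family, qualifier, timestamp, type);
        }
    }

    /**
     * Returns the WAL edits that are not already present in the snapshot,
     * also dropping repeats within the WAL itself. Order is preserved.
     */
    static List<Edit> dedupe(List<Edit> snapshotEdits, List<Edit> walEdits) {
        Set<Edit> seen = new HashSet<>(snapshotEdits);
        List<Edit> result = new ArrayList<>();
        for (Edit e : walEdits) {
            if (seen.add(e)) { // add() returns false when the edit was already seen
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The snapshot already holds one put.
        List<Edit> snapshot = Arrays.asList(new Edit("r1", "cf", "q1", 100L, "Put"));
        // The next WAL repeats that put and carries one new put, written twice.
        List<Edit> wal = Arrays.asList(
            new Edit("r1", "cf", "q1", 100L, "Put"),
            new Edit("r2", "cf", "q1", 101L, "Put"),
            new Edit("r2", "cf", "q1", 101L, "Put"));
        System.out.println(dedupe(snapshot, wal).size()); // prints 1
    }
}
```

Keying on coordinates avoids relying on sequence ids, which the description says are not recorded. The trade-off in this sketch is that the set of seen edits is held in memory.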



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
