hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-14417) Incremental backup and bulk loading
Date Mon, 20 Mar 2017 18:59:41 GMT

     [ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14417:
---------------------------
    Description: 
Currently, incremental backup is based on WAL files. Bulk data loading bypasses WALs for obvious
reasons, breaking incremental backups. The only way to continue backups after bulk loading
is to create a new full backup of a table. This may not be feasible for customers who do bulk
loading regularly (say, every day).

Here is the review board (out of date):
https://reviews.apache.org/r/54258/

In order not to miss hfiles that are loaded into region directories when the
postBulkLoadHFile() hook is not called (the bulk load was interrupted), we record hfile names
through the preCommitStoreFile() hook.
At the time of incremental backup, we check for the presence of such hfiles. If they are present,
they become part of the incremental backup image.
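
A minimal sketch of the record-then-check idea described above, in plain Java. It is not
the code from the patch: BulkLoadTracker, recordBeforeCommit() and
collectForIncrementalBackup() are hypothetical names, and java.nio.file stands in for the
cluster file system. Keeping entries whose file is not yet present is also an assumption
of this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not an HBase class.
final class BulkLoadTracker {
    // Destination paths recorded before the commit, i.e. at the point where a
    // preCommitStoreFile()-style hook would fire.
    private final Set<Path> recorded = new LinkedHashSet<>();

    // Called from the pre-commit hook: remember where the hfile is about to land.
    synchronized void recordBeforeCommit(Path destination) {
        recorded.add(destination);
    }

    // Called at incremental backup time: return the recorded hfiles that actually
    // exist, and stop tracking them. Entries whose file is missing stay recorded
    // (assumption: such a load may still be in flight).
    synchronized List<Path> collectForIncrementalBackup() {
        List<Path> present = new ArrayList<>();
        for (Iterator<Path> it = recorded.iterator(); it.hasNext(); ) {
            Path p = it.next();
            if (Files.exists(p)) {
                present.add(p);
                it.remove();
            }
        }
        return present;
    }

    public static void main(String[] args) throws IOException {
        Path regionDir = Files.createTempDirectory("region");
        BulkLoadTracker tracker = new BulkLoadTracker();

        Path loaded = regionDir.resolve("hfile-a");
        Path neverCommitted = regionDir.resolve("hfile-b");
        tracker.recordBeforeCommit(loaded);
        tracker.recordBeforeCommit(neverCommitted);

        // Only the first file reaches the region directory; the post-load hook
        // is never invoked for either of them.
        Files.createFile(loaded);

        System.out.println(tracker.collectForIncrementalBackup());
    }
}
```

In this sketch the file that reached the region directory is picked up even though no
post-load notification was ever delivered, which is the case the preCommitStoreFile()
recording is meant to cover.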

Here is the review board:
https://reviews.apache.org/r/57790/

Google doc for design:
https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE

  was:
Currently, incremental backup is based on WAL files. Bulk data loading bypasses WALs for obvious
reasons, breaking incremental backups. The only way to continue backups after bulk loading
is to create a new full backup of a table. This may not be feasible for customers who do bulk
loading regularly (say, every day).

Here is the review board (out of date):
https://reviews.apache.org/r/54258/

In order not to miss hfiles that are loaded into region directories when the
postBulkLoadHFile() hook is not called (the bulk load was interrupted), we record hfile names
through the preCommitStoreFile() hook.
At the time of incremental backup, we check for the presence of such hfiles. If they are present,
they become part of the incremental backup image.

Google doc for design:
https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE


> Incremental backup and bulk loading
> -----------------------------------
>
>                 Key: HBASE-14417
>                 URL: https://issues.apache.org/jira/browse/HBASE-14417
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Vladimir Rodionov
>            Assignee: Ted Yu
>            Priority: Blocker
>              Labels: backup
>             Fix For: 2.0
>
>         Attachments: 14417-tbl-ext.v10.txt, 14417-tbl-ext.v11.txt, 14417-tbl-ext.v14.txt,
>                      14417-tbl-ext.v18.txt, 14417-tbl-ext.v19.txt, 14417-tbl-ext.v20.txt,
>                      14417-tbl-ext.v21.txt, 14417-tbl-ext.v9.txt, 14417.v11.txt, 14417.v13.txt,
>                      14417.v1.txt, 14417.v21.txt, 14417.v23.txt, 14417.v24.txt, 14417.v25.txt,
>                      14417.v2.txt, 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading bypasses WALs
> for obvious reasons, breaking incremental backups. The only way to continue backups after
> bulk loading is to create a new full backup of a table. This may not be feasible for customers
> who do bulk loading regularly (say, every day).
> Here is the review board (out of date):
> https://reviews.apache.org/r/54258/
> In order not to miss hfiles that are loaded into region directories when the
> postBulkLoadHFile() hook is not called (the bulk load was interrupted), we record hfile
> names through the preCommitStoreFile() hook.
> At the time of incremental backup, we check for the presence of such hfiles. If they are present,
> they become part of the incremental backup image.
> Here is the review board:
> https://reviews.apache.org/r/57790/
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
