hbase-issues mailing list archives

From "Jerry He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10900) FULL table backup and restore
Date Thu, 19 Feb 2015 22:03:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328182#comment-14328182 ]

Jerry He commented on HBASE-10900:

Had an offline discussion with [~nidmhbase] and [~apurtell].
Will keep the JIRA open so that people from our team or other people can work on it in the future.
Thanks, Demai, Andrew.

> FULL table backup and restore
> -----------------------------
>                 Key: HBASE-10900
>                 URL: https://issues.apache.org/jira/browse/HBASE-10900
>             Project: HBase
>          Issue Type: Task
>            Reporter: Demai Ni
>         Attachments: HBASE-10900-fullbackup-trunk-v1.patch, HBASE-10900-trunk-v2.patch,
HBASE-10900-trunk-v3.patch, HBASE-10900-trunk-v4.patch
> h2. Feature Description
> This is a subtask of [HBase-7912|https://issues.apache.org/jira/browse/HBASE-7912] to
support FULL backup/restore, and will complete the following function:
> {code:title=Backup Restore example|borderStyle=solid}
> /* backup from sourcecluster to targetcluster                                  */
> /* if no table name is specified, all tables from source cluster will be backed up */
> [sourcecluster]$ hbase backup create full hdfs://hostname.targetcluster.org:9000/userid/backupdir
> /* restore on targetcluster, this is a local restore */
> /* backup_1396650096738 - backup image name */
> /* t1_dn, etc. are the original table names. All tables will be restored if not specified */
> /* t1_dn_restore, etc. are the restored tables. If not specified, the original table names will
be used */
> [targetcluster]$ hbase restore /userid/backupdir backup_1396650096738 t1_dn,t2_dn,t3_dn
> /* restore from targetcluster back to source cluster, this is a remote restore */
> [sourcecluster]$ hbase restore hdfs://hostname.targetcluster.org:9000/userid/backupdir
backup_1396650096738 t1_dn,t2_dn,t3_dn t1_dn_restore,t2_dn_restore,t3_dn_restore
> {code}
> h2. Detailed layout and framework for the next JIRAs
> The patch is a wrapper of the existing snapshot and exportSnapshot, and will be used as the
base framework for the overall solution of [HBase-7912|https://issues.apache.org/jira/browse/HBASE-7912]
as described below:
> * *bin/hbase*          : end-user command line interface to invoke BackupClient and RestoreClient
> * *BackupClient.java*  : 'main' entry for backup operations. This patch will only support
'full' backup. Future JIRAs will support:
> ** *create* incremental backup
> ** *cancel* an ongoing backup
> ** *delete* an existing backup image
> ** *describe* the detailed information of a backup image
> ** show *history* of all successful backups 
> ** show the *status* of the latest backup request
> ** *convert* incremental backup WAL files into HFiles, either on-the-fly during create
or after create
> ** *merge* backup image
> ** *stop* backing up a table of an existing backup image
> ** *show* tables of a backup image 
> * *BackupCommands.java* : a place to keep all the command usages and options
> * *BackupManager.java*  : handles backup requests on the server side and creates BACKUP ZOOKEEPER
nodes to keep track of backups. The timestamps kept in ZooKeeper will be used for future incremental
backup (not included in this JIRA). Creates BackupContext and DispatchRequest.
> * *BackupHandler.java*  : in this patch, it is a wrapper of snapshot and exportsnapshot.
In future jiras, 
> ** *timestamps* info will be recorded in ZK
> ** carry out *incremental* backup
> ** update backup *progress*
> ** set flags of *status*
> ** build up the *backupManifest* file (in this JIRA, only limited info for full backup; later
on, timestamps and dependencies of multiple backup images will also be recorded here)
> ** clean up after *failed* backup 
> ** clean up after *cancelled* backup
> ** allow on-the-fly *convert* during incremental backup 
> * *BackupContext.java* : encapsulate backup information like backup ID, table names,
directory info, phase, TimeStamps of backup progress, size of data, ancestor info, etc. 
> * *BackupCopier.java*  : the copying operation. Later on, it will support progress reporting
and mapper estimation, and will extend DistCp to update progress to ZK during backup.
> * *BackupExcpetion.java*: handles exceptions from backup/restore
> * *BackupManifest.java* : encapsulates all the backup image information. The manifest
info will be bundled as a manifest file together with the data, so that each backup image contains
all the info needed for restore.
> * *BackupStatus.java*   : encapsulate backup status at table level during backup progress
> * *BackupUtil.java*     : utility methods during backup process
> * *RestoreClient.java*  : 'main' entry for restore operations. This patch will only support
'full' backup. 
> * *RestoreUtil.java*    : utility methods during restore process
> * *ExportSnapshot.java* : remove 'final' so that another class, SnapshotCopy.java, can
extend it
> * *SnapshotCopy.java*   : only a wrapper at this moment, but will be extended to keep
track of progress (maybe this should be implemented in ExportSnapshot directly?)
> * *BackupRestoreConstants.java*     : add the constants used by backup/restore code.
> * *HBackupFilesystem.java*     :   the filesystem related api used by BackupClient and
> h2. Global log roll 
> currently a customized one under *org.apache.hadoop.hbase.backup.master* and *org.apache.hadoop.hbase.backup.regionserver*
> [HBASE-11148|https://issues.apache.org/jira/browse/HBASE-11148] is opened to provide
a general 'global log roll', and fullbackup code will be modified to use the general 'global
log roll' later once HBase-11148 is accepted by the community. 
> h2. Interface
> * currently, the code is under *hbase-server* because it already contains a package
named 'backup'. If moved to *hbase-client*, the pom file has to be updated to include more
> * currently invoked through the bin/hbase script as a CLI. One advantage is that it is easy to
embed into a Linux sh script
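
To illustrate the last point in the quoted description (embedding the CLI in a sh script), here is a minimal sketch. It assumes only the command syntax shown in the quoted "Backup Restore example". The function names and the HBASE_CMD variable are hypothetical and are not part of the patch, and the argument checks reflect my reading of the example rather than the patch's actual behaviour.

```shell
#!/bin/sh
# Sketch only: thin wrappers around the CLI syntax proposed in HBASE-10900.
# HBASE_CMD, full_backup and restore_tables are hypothetical names.
# Set HBASE_CMD=echo to print the arguments instead of calling hbase.
HBASE_CMD="${HBASE_CMD:-hbase}"

# Run a full backup of all tables to the given backup root directory.
# Usage: full_backup <backup_root_dir>
full_backup() {
    if [ "$#" -ne 1 ] || [ -z "$1" ]; then
        echo "usage: full_backup <backup_root_dir>" >&2
        return 1
    fi
    "$HBASE_CMD" backup create full "$1"
}

# Restore a backup image, optionally limited to some tables and
# optionally restoring them under new names.
# Usage: restore_tables <backup_root_dir> <backup_image> [tables [restored_tables]]
restore_tables() {
    if [ "$#" -lt 2 ] || [ "$#" -gt 4 ]; then
        echo "usage: restore_tables <backup_root_dir> <backup_image> [tables [restored_tables]]" >&2
        return 1
    fi
    # Arguments are passed through unchanged, in the order the example uses.
    "$HBASE_CMD" restore "$@"
}
```

A calling script would source this file and then run, for example, `full_backup hdfs://hostname.targetcluster.org:9000/userid/backupdir` on the source cluster. Running with `HBASE_CMD=echo` gives a dry run that only prints the arguments that would be passed to hbase.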

This message was sent by Atlassian JIRA
