hbase-issues mailing list archives

From "Demai Ni (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-11085) Incremental Backup Restore support
Date Tue, 27 May 2014 21:32:02 GMT

     [ https://issues.apache.org/jira/browse/HBASE-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Demai Ni updated HBASE-11085:
-----------------------------

    Description: 
h2. Feature Description
This JIRA is part of [HBASE-7912|https://issues.apache.org/jira/browse/HBASE-7912] and depends
on full backup [HBASE-10900|https://issues.apache.org/jira/browse/HBASE-10900]. For the detailed
layout and framework, please refer to [HBASE-10900|https://issues.apache.org/jira/browse/HBASE-10900].

When a client issues an incremental backup request, BackupManager checks the request and
then kicks off a global procedure via HBaseAdmin so that all active region servers roll their logs.
Each region server records its log number in ZooKeeper. We then determine which logs
need to be included in this incremental backup, and use DistCp to copy them to the target location.
At the same time, the dependency of the backup image is recorded, and it is later saved in the backup
manifest file.
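
To make the log-selection step concrete, here is a minimal sketch in Python. It is not the patch's code: the names ({{WalFile}}, {{select_logs}}) are hypothetical, and it assumes that the log number recorded after a roll is the first log that has NOT been backed up yet, so each server contributes the logs in the half-open range between its previous and its current recorded number.

```python
from typing import Dict, List, NamedTuple


class WalFile(NamedTuple):
    """Hypothetical description of one WAL file on a region server."""
    server: str       # region server name
    log_number: int   # number taken from the WAL file name (assumed increasing)
    path: str         # location of the file


def select_logs(wals: List[WalFile],
                previous: Dict[str, int],
                current: Dict[str, int]) -> List[WalFile]:
    """Pick the WALs written since the previous backup.

    previous: log number each server recorded at the previous backup
    current:  log number each server recorded after this backup's log roll
    Assumption: a server with no previous record starts from 0.
    """
    selected = []
    for wal in wals:
        if wal.server not in current:
            continue  # server did not take part in this log roll
        start = previous.get(wal.server, 0)
        end = current[wal.server]
        if start <= wal.log_number < end:
            selected.append(wal)
    # Stable, predictable order for the copy job.
    return sorted(selected, key=lambda w: (w.server, w.log_number))


wals = [
    WalFile("rs1", 10, "/wal/rs1.10"),
    WalFile("rs1", 11, "/wal/rs1.11"),
    WalFile("rs1", 12, "/wal/rs1.12"),
    WalFile("rs2", 5, "/wal/rs2.5"),
    WalFile("rs2", 6, "/wal/rs2.6"),
]
to_copy = select_logs(wals, previous={"rs1": 11, "rs2": 5},
                      current={"rs1": 12, "rs2": 6})
print([w.path for w in to_copy])  # ['/wal/rs1.11', '/wal/rs2.5']
```

The resulting list is what would be handed to the copy step (DistCp in this design). The exact boundary rule used by the patch may differ from the half-open range assumed here.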

Restore replays the backed-up WAL logs on the target HBase instance. The replay occurs
after the full backup has been restored.

An incremental backup image depends on the prior full backup image and on any earlier incremental images.
The manifest file stores this dependency lineage during backup, and it is used at
restore time for point-in-time (PIT) restore.
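
As an illustration of how such a dependency lineage could drive a PIT restore, here is a small Python sketch. It is only a sketch under assumptions: the manifest is modeled as a plain mapping from a backup ID to the ID of the image it depends on ({{None}} for a full backup), which is not necessarily the real manifest format, and {{restore_order}} is a hypothetical helper. The backup IDs are the ones from the use case in this description.

```python
from typing import Dict, List, Optional


def restore_order(manifest: Dict[str, Optional[str]], backup_id: str) -> List[str]:
    """Return the images to restore, oldest first, ending with backup_id.

    manifest maps each backup ID to the ID it depends on, or None for a
    full backup (an assumed, simplified stand-in for the manifest file).
    """
    chain: List[str] = []
    seen = set()
    current: Optional[str] = backup_id
    while current is not None:
        if current in seen:
            raise ValueError(f"dependency cycle at {current}")
        if current not in manifest:
            raise KeyError(f"unknown backup image {current}")
        seen.add(current)
        chain.append(current)
        current = manifest[current]
    chain.reverse()  # full backup first, then incrementals in order
    return chain


manifest = {
    "backup_1399667695966": None,                    # full backup
    "backup_1399667851020": "backup_1399667695966",  # 1st incremental
    "backup_1399667959165": "backup_1399667851020",  # 2nd incremental
}
print(restore_order(manifest, "backup_1399667851020"))
# ['backup_1399667695966', 'backup_1399667851020']
```

Walking the lineage back to the full backup and then reversing it gives the order in which the images would be applied: the full backup first, followed by each incremental up to the requested point in time.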

h2. Use case (example)
{code:title=Incremental Backup Restore example|borderStyle=solid}
/*******************************************************************************************/
/* STEP1:  FULL backup from sourcecluster to targetcluster
/* if no table name is specified, all tables from the source cluster will be backed up
/*******************************************************************************************/
[sourcecluster]$ hbase backup create full hdfs://hostname.targetcluster.org:9000/userid/backupdir t1_dn,t2_dn,t3_dn
...
14/05/09 13:35:46 INFO backup.BackupManager: Backup request backup_1399667695966 has been executed.

/*******************************************************************************************/
/* STEP2:   In HBase Shell, put a few rows
/*******************************************************************************************/
hbase(main):002:0> put 't1_dn','row100','cf1:q1','value100_0509_increm1'
hbase(main):003:0> put 't2_dn','row100','cf1:q1','value100_0509_increm1'
hbase(main):004:0> put 't3_dn','row100','cf1:q1','value100_0509_increm1'

/*******************************************************************************************/
/* STEP3:   Take the 1st incremental backup                                            
/*******************************************************************************************/
[sourcecluster]$ hbase backup create incremental hdfs://hostname.targetcluster.org:9000/userid/backupdir
...
14/05/09 13:37:45 INFO backup.BackupManager: Backup request backup_1399667851020 has been
executed.

/*******************************************************************************************/
/* STEP4:   In HBase Shell, put a few more rows.                                      
/*               update 'row100', and create new 'row101'                               
/*******************************************************************************************/
hbase(main):005:0> put 't3_dn','row100','cf1:q1','value101_0509_increm2'
hbase(main):006:0> put 't2_dn','row100','cf1:q1','value101_0509_increm2'
hbase(main):007:0> put 't1_dn','row100','cf1:q1','value101_0509_increm2'
hbase(main):009:0> put 't1_dn','row101','cf1:q1','value101_0509_increm2'
hbase(main):010:0> put 't2_dn','row101','cf1:q1','value101_0509_increm2'
hbase(main):011:0> put 't3_dn','row101','cf1:q1','value101_0509_increm2'

/*******************************************************************************************/
/* STEP5:   Take the 2nd incremental backup                                           
/*******************************************************************************************/
[sourcecluster]$ hbase backup create incremental hdfs://hostname.targetcluster.org:9000/userid/backupdir
...
14/05/09 13:39:33 INFO backup.BackupManager: Backup request backup_1399667959165 has been
executed.

/*******************************************************************************************/
/* STEP7:   Restore from PIT of the 1st incremental backup
/* specify the backup ID of the 1st incremental backup
/* option -automatic will trigger the restore of the full backup first, then the 1st
/* incremental backup image
/* t1_dn, etc. are the original table names. All tables will be restored if none are specified
/* t1_dn_restore1, etc. are the restored tables. If not specified, the original table names will be used
/*******************************************************************************************/
[sourcecluster]$ hbase restore -automatic hdfs://hostname.targetcluster.org:9000/userid/backupdir backup_1399667851020 t1_dn,t2_dn,t3_dn t1_dn_restore1,t2_dn_restore1,t3_dn_restore1

/*******************************************************************************************/
/* STEP8:   Restore from PIT of the 2nd incremental backup
/* specify the backup ID of the 2nd incremental backup
/* option -automatic will trigger the restore of the full backup first, then the 1st
/* incremental backup image, and finally the 2nd incremental backup image
/*******************************************************************************************/
[sourcecluster]$ hbase restore -automatic hdfs://hostname.targetcluster.org:9000/userid/backupdir backup_1399667959165 t1_dn,t2_dn,t3_dn t1_dn_restore2,t2_dn_restore2,t3_dn_restore2
{code}

h2. Patch history
Since this JIRA depends on [HBASE-10900|https://issues.apache.org/jira/browse/HBASE-10900],
two patches will be uploaded for each version. One is the actual patch for this incremental-update
JIRA; the other also contains the patch it depends on, so that the changes 1) are easy to review and
2) can be applied by HadoopQA.

* Version 1 (https://reviews.apache.org/r/21492/)
** [HBASE-11085-trunk-v1.patch|https://issues.apache.org/jira/secure/attachment/12644214/HBASE-11085-trunk-v1.patch]: incremental update/restore code
** [HBASE-11085-trunk-v1-contains-HBASE-10900-trunk-v4.patch|https://issues.apache.org/jira/secure/attachment/12644215/HBASE-11085-trunk-v1-contains-HBASE-10900-trunk-v4.patch]: contains both [HBASE-11085-trunk-v1.patch|https://issues.apache.org/jira/secure/attachment/12644214/HBASE-11085-trunk-v1.patch] and [HBASE-10900-trunk-v4.patch|https://issues.apache.org/jira/secure/attachment/12644142/HBASE-10900-trunk-v4.patch]

* Version 2 
** 

> Incremental Backup Restore support
> ----------------------------------
>
>                 Key: HBASE-11085
>                 URL: https://issues.apache.org/jira/browse/HBASE-11085
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Demai Ni
>            Assignee: Demai Ni
>             Fix For: 0.99.0
>
>         Attachments: HBASE-11085-trunk-v1-contains-HBASE-10900-trunk-v4.patch, HBASE-11085-trunk-v1.patch, HBASE-11085-trunk-v2-contain-HBASE-10900-trunk-v4.patch, HBASE-11085-trunk-v2.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
