From: "Li Chongxin (JIRA)"
To: hbase-issues@hadoop.apache.org
Date: Mon, 5 Apr 2010 14:25:27 +0000 (UTC)
Subject: [jira] Commented: (HBASE-50) Snapshot of table

    [ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853374#action_12853374 ]

Li Chongxin commented on HBASE-50:
----------------------------------

Hi folks, I'd like to work on this issue as my GSoC project. Based on the comments above, I've got some initial ideas. Here is my proposal; any comments are welcome.

1. A snapshot is triggered by sending a signal across the cluster, probably distributed via ZooKeeper (see the watcher sketch after this list).

2. On receipt, a region server has two options:
   a) Flush the memstore, so that all writes to the table are persistent before the snapshot.
   b) Roll the write-ahead log, so that we can get the memstore content from the WAL. This is the same as Clint Morgan describes above. It might narrow the snapshot window, but extra work is needed to extract the memstore content, and if writeToWAL is set to false for a Put, that data might be lost from the snapshot. (Maybe we could let the user choose whether to use the memstore or the WAL.)

3. Region servers dump a manifest of all regions they are currently serving and the files those regions are comprised of. The region server hosting .META. also dumps the metadata for the table. The manifest plus the metadata make up the snapshot information file, stored as <table>-<time>.info under the snapshot directory, say "snapshot.dir".

4. Since HBase data files are immutable except during compaction and splits (right?), the files listed in a <table>-<time>.info file stay the same until a compaction or split occurs. So we can use a technique similar to copy-on-write: when a compaction or split occurs, an old file is not deleted but moved aside if it is referenced by any <table>-<time>.info file under the snapshot directory (see the move-aside sketch after this list). At the same time, the affected snapshot information files are updated, because the locations of those files have changed.

5. Subsequent operations on the snapshot, e.g. a distcp or a MapReduce job, first read the snapshot information file <table>-<time>.info and then read the corresponding data files.

6. This method keeps the snapshot window narrow, and a snapshot request can be answered quickly because nothing is actually copied.

7. When a snapshot is deleted, check the files listed in its <table>-<time>.info. If a file is a moved-aside old file and is not used by any other snapshot, it can be deleted from the file system. Finally, delete the snapshot information file itself (see the deletion sketch below).
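To make step 1 concrete, here is a rough sketch of what I have in mind on the region server side. The znode path /hbase/snapshot and the takeLocalSnapshot() hook are made up for illustration; only the watch mechanics come from the stock ZooKeeper client API:

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Each region server watches a well-known znode and treats its creation
// as the "take a snapshot" signal.
public class SnapshotWatcher implements Watcher {
  private static final String SNAPSHOT_ZNODE = "/hbase/snapshot"; // assumed path
  private final ZooKeeper zk;

  public SnapshotWatcher(ZooKeeper zk)
      throws KeeperException, InterruptedException {
    this.zk = zk;
    zk.exists(SNAPSHOT_ZNODE, this); // sets the watch even if the node is absent
  }

  @Override
  public void process(WatchedEvent event) {
    if (event.getType() == Event.EventType.NodeCreated
        && SNAPSHOT_ZNODE.equals(event.getPath())) {
      takeLocalSnapshot(); // step 2: flush the memstore or roll the WAL,
                           // then dump the region manifest (step 3)
    }
    try {
      zk.exists(SNAPSHOT_ZNODE, this); // watches are one-shot; re-arm
    } catch (Exception e) {
      // handle connection loss etc.
    }
  }

  private void takeLocalSnapshot() {
    // hypothetical hook into the region server
  }
}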
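For the copy-on-write part of step 4, the check could look roughly like this. Just a sketch: the class name, the one-path-per-line .info format, and readFileList() are my own placeholder assumptions, not existing HBase code; only the FileSystem calls are the standard Hadoop API:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Before a compaction/split discards a store file, move it aside
// if any snapshot under snapshotDir still lists it.
public class SnapshotCopyOnWrite {

  public static void removeOrMoveAside(FileSystem fs, Path snapshotDir,
      Path storeFile) throws IOException {
    boolean referenced = false;
    for (FileStatus info : fs.listStatus(snapshotDir)) {
      if (readFileList(fs, info.getPath()).contains(storeFile.toString())) {
        referenced = true;
        break;
      }
    }
    if (referenced) {
      fs.rename(storeFile, storeFile.suffix(".old"));
      // every .info file that listed storeFile must now be rewritten
      // to point at the .old location (omitted here)
    } else {
      fs.delete(storeFile, false); // no snapshot references it; safe to delete
    }
  }

  // Assumed .info format: one file path per line.
  static Set<String> readFileList(FileSystem fs, Path info)
      throws IOException {
    Set<String> files = new HashSet<String>();
    BufferedReader in =
        new BufferedReader(new InputStreamReader(fs.open(info)));
    try {
      for (String line = in.readLine(); line != null; line = in.readLine()) {
        files.add(line.trim());
      }
    } finally {
      in.close();
    }
    return files;
  }
}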
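Deletion (step 7) would then walk the same structures. Again only a sketch, reusing the hypothetical readFileList() above; these methods would sit next to the previous ones:

// A moved-aside .old file may only be removed once no other snapshot's
// .info file still lists it; the .info file itself goes last.
public static void deleteSnapshot(FileSystem fs, Path snapshotDir,
    Path infoToDelete) throws IOException {
  for (String file : SnapshotCopyOnWrite.readFileList(fs, infoToDelete)) {
    if (file.endsWith(".old")
        && !referencedByOtherSnapshot(fs, snapshotDir, infoToDelete, file)) {
      fs.delete(new Path(file), false);
    }
  }
  fs.delete(infoToDelete, false); // finally drop the snapshot information file
}

static boolean referencedByOtherSnapshot(FileSystem fs, Path snapshotDir,
    Path skip, String file) throws IOException {
  for (FileStatus other : fs.listStatus(snapshotDir)) {
    if (!other.getPath().equals(skip)
        && SnapshotCopyOnWrite.readFileList(fs, other.getPath())
            .contains(file)) {
      return true;
    }
  }
  return false;
}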
Example:

a) Table A is comprised of three regions: region1[a], region2[b,c], region3[d] (the data files of each region are listed in brackets). A snapshot of this table produces the snapshot information file A-20100404.info, which is comprised of {a, b, c, d, meta1}.

b) Some more writes are then performed on table A, and the three regions become region1[a,e], region2[b,c], region3[d]. Another snapshot of table A results in A-20100405.info = {a, b, c, d, e, meta2}.

c) Compaction is performed on region1 and region2, so table A now has region1[f], region2[g], region3[d]. Since files a, b, c, e are used in previous snapshot information files, they are moved aside as a.old, b.old, c.old, e.old. Accordingly, A-20100404.info is updated to {a.old, b.old, c.old, d, meta1} and A-20100405.info to {a.old, b.old, c.old, d, e.old, meta2}.

d) At this point, a snapshot of table A results in A-20100406.info = {f, g, d, meta3}.

> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Priority: Minor
>
> Having an option to take a snapshot of a table would be very useful in production.
>
> What I would like this option to do is merge all the data into one or more files stored in the same folder on the DFS. This way we could save the data in case of a software bug in Hadoop or user code.
>
> The other advantage would be the ability to export a table to multiple locations. Say I had a read-only table that must stay online. I could take a snapshot of it when needed, export it to a separate data center, and have it loaded there, so the table would be online at multiple data centers for load balancing and failover.
>
> I understand that Hadoop removes the need for backups to protect against failed servers, but that does not protect us from software bugs that might delete or alter data in ways we did not plan. We should have a way to roll back a dataset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.