hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-50) Snapshot of table
Date Thu, 12 Aug 2010 07:58:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897619#action_12897619
] 

HBase Review Board commented on HBASE-50:
-----------------------------------------

Message from: "Chongxin Li" <lichongxin@zju.edu.cn>


bq.  On 2010-08-11 11:32:27, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/SnapshotMonitor.java, line 166
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6019#file6019line166>
bq.  >
bq.  >     Want to remove this or enable the assertion?  One or the other I'd say rather than this.

I'll remove it.


bq.  On 2010-08-11 11:32:27, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/SnapshotTracker.java, line 1
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6021#file6021line1>
bq.  >
bq.  >     It's a pity this class is named so.  We're about to bring in a new patch that redoes the zk stuff -- breaks it up into pieces each with a singular purpose; e.g. tracking root location or tracking meta region server -- and unfortunately the pattern is to name these purposed classes *Tracker.  There'll be a clash of this kinda Tracker and the new zk Trackers.  Not important, just saying in case you have another name in mind for this class.

I'll think about it. Any suggestions?


bq.  On 2010-08-11 11:32:27, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 2288
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6024#file6024line2288>
bq.  >
bq.  >     And flushing is disabled at this point too, right?  Compactions? (Good).

Yes, flushing and compaction are both disabled while the snapshot is being taken.


bq.  On 2010-08-11 11:32:27, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 944
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6027#file6027line944>
bq.  >
bq.  >     Do we have to do this down at the Store level?  Could we move it up to Region or up to the RegionServer itself?  It already has an HTable instance.

This method is only used to delete old store files after compaction. Is it appropriate to
move it to Region?


bq.  On 2010-08-11 11:32:27, stack wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/master/TestSnapshot.java, line 382
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6037#file6037line382>
bq.  >
bq.  >     What about a test of restore from snapshot?  Is there one?  I don't see it.

It's already in TestAdmin


bq.  On 2010-08-11 11:32:27, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/FSUtils.java, line 713
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6032#file6032line713>
bq.  >
bq.  >     Does this stuff belong in here in this general utility class?  Should it be polluted with References?  Should this stuff be over in io package where the Reference is or static methods on Reference?

OK, I'll move it to Reference


bq.  On 2010-08-11 11:32:27, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 267
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6028#file6028line267>
bq.  >
bq.  >     Why do you have to pass the reference?  It wasn't needed previously?

Previously there was only one type of reference file, i.e. the reference created after a
split. Now there is a second type of reference file, used for snapshots, so we need to know
the reference type in order to locate the referred-to file.

This is used for tables restored from a snapshot.
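
To illustrate why the reference type matters, here is a minimal sketch of resolving the referred-to file differently per type. Everything in it is an assumption made for illustration: the class name ReferenceSketch, the RefType enum, the resolve method, and both directory layouts are hypothetical and are not taken from the patch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReferenceSketch {

    /** Hypothetical reference kinds; the names are illustrative only. */
    public enum RefType { SPLIT, SNAPSHOT }

    /**
     * Resolve the file that a reference file points at.
     *
     * Assumed layouts (illustration only, not the patch's actual layout):
     *   SPLIT:    table/daughter/family/hfile.parent  ->  table/parent/family/hfile
     *   SNAPSHOT: restored/region/family/hfile.source ->  source/region/family/hfile
     *
     * The path is assumed to have at least four components
     * (table, region, family, file name).
     */
    public static Path resolve(Path refFile, RefType type) {
        String name = refFile.getFileName().toString();
        int dot = name.lastIndexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("not a reference file: " + refFile);
        }
        String hfile = name.substring(0, dot);    // name of the referred-to store file
        String target = name.substring(dot + 1);  // parent region or source table
        Path family = refFile.getParent();
        Path region = family.getParent();
        Path table = region.getParent();
        switch (type) {
            case SPLIT:
                // Same table, but the parent region's directory.
                return table.resolve(target)
                            .resolve(family.getFileName())
                            .resolve(hfile);
            case SNAPSHOT:
                // Same region name, but the source table's directory.
                return table.resolveSibling(target)
                            .resolve(region.getFileName())
                            .resolve(family.getFileName())
                            .resolve(hfile);
            default:
                throw new AssertionError(type);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(
            Paths.get("hbase/t1/daughter/cf/abc123.parent"), RefType.SPLIT));
        System.out.println(resolve(
            Paths.get("hbase/restored/r1/cf/abc123.t1"), RefType.SNAPSHOT));
    }
}
```

The point of the sketch is only that the same file name suffix means different things depending on the type, so the caller has to pass the type along.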


bq.  On 2010-08-11 11:32:27, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 2355
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6024#file6024line2355>
bq.  >
bq.  >     If snapshot fails, do we have to do cleanup?

HRegions simply quit snapshot mode if the snapshot fails. The master is notified of the
failure and does the cleanup work for the whole snapshot.
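
The division of work on failure could be sketched roughly as follows. This is a simplified, single-process illustration under stated assumptions: the classes Master and Region, and the methods regionDone, regionFailed and snapshot, are hypothetical stand-ins and not the classes or signatures in the patch. The real notification path between region server and master is not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SnapshotFailureSketch {

    /** Hypothetical master-side bookkeeping for in-progress snapshots. */
    public static class Master {
        // snapshot name -> regions that have finished their part
        private final Map<String, List<String>> completed = new HashMap<>();

        public void regionDone(String snapshot, String region) {
            completed.computeIfAbsent(snapshot, k -> new ArrayList<>()).add(region);
        }

        /** On any region failure, discard the whole snapshot, not one region's part. */
        public void regionFailed(String snapshot, String region) {
            completed.remove(snapshot);
        }

        public int completedRegions(String snapshot) {
            List<String> regions = completed.get(snapshot);
            return regions == null ? 0 : regions.size();
        }
    }

    /** Hypothetical region: it only enters and leaves snapshot mode. */
    public static class Region {
        private final String name;
        private boolean snapshotMode;

        public Region(String name) { this.name = name; }

        public boolean inSnapshotMode() { return snapshotMode; }

        public void snapshot(String snapshot, Master master, Runnable work) {
            snapshotMode = true;  // flushes and compactions are held off here
            try {
                work.run();
                master.regionDone(snapshot, name);
            } catch (RuntimeException e) {
                // The region does no cleanup itself; it only reports the failure.
                master.regionFailed(snapshot, name);
            } finally {
                snapshotMode = false;  // leave snapshot mode on success or failure
            }
        }
    }

    public static void main(String[] args) {
        Master master = new Master();
        new Region("r1").snapshot("s1", master, () -> { });
        new Region("r2").snapshot("s1", master, () -> {
            throw new IllegalStateException("simulated failure");
        });
        System.out.println("completed regions: " + master.completedRegions("s1"));
    }
}
```

Keeping cleanup on the master side means a region never has to know what other regions have already written for the same snapshot.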


- Chongxin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/467/#review840
-----------------------------------------------------------





> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Assignee: Li Chongxin
>            Priority: Minor
>         Attachments: HBase Snapshot Design Report V2.pdf, HBase Snapshot Design Report V3.pdf, HBase Snapshot Implementation Plan.pdf, Snapshot Class Diagram.png
>
>
> Having an option to take a snapshot of a table would be very useful in production.
> What I would like to see this option do is merge all the data into one or more files stored in the same folder on the DFS. This way we could save data in case of a software bug in Hadoop or user code.
> The other advantage would be the ability to export a table to multiple locations. Say I had a read-only table that must be online. I could take a snapshot of it when needed, export it to a separate data center and have it loaded there, and then I would have it online at multiple data centers for load balancing and failover.
> I understand that Hadoop removes the need for backups to protect against failed servers, but this does not protect us from software bugs that might delete or alter data in ways we did not plan. We should have a way to roll back a dataset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

