hbase-issues mailing list archives

From "Li Chongxin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-50) Snapshot of table
Date Tue, 08 Jun 2010 15:28:22 GMT

    [ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876699#action_12876699 ]

Li Chongxin commented on HBASE-50:
----------------------------------

bq. ... but also after snapshot is done.... your design should include description of how
files are archived, rather than deleted...

Are you talking about files that are no longer used by the hbase table but are still referenced
by a snapshot? I think this has been described in chapter 6, 'Snapshot Maintenance'. For example,
hfiles are archived in the delete directory, and section 6.4 describes how these files are
eventually cleaned up.
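Roughly, the idea is to do something like this instead of a plain delete (a minimal sketch
only; the class name, helper name and exact paths are illustrative, not the actual design):

{code:java}
// Sketch only: instead of deleting an hfile that a snapshot still
// references, move it into the '.deleted' dir under the snapshot dir
// so the snapshot's references keep resolving.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HFileArchiver {
  public static void archiveInsteadOfDelete(Configuration conf,
      Path rootDir, Path hfile) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    Path deletedDir = new Path(rootDir, ".snapshot/.deleted");
    fs.mkdirs(deletedDir);
    // a rename is a cheap metadata-only operation on the dfs
    if (!fs.rename(hfile, new Path(deletedDir, hfile.getName()))) {
      throw new IOException("Failed to archive " + hfile);
    }
  }
}
{code}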

bq. ..In fact you'll probably be doing a snapshot of at least a subset of .META. on every
table snapshot I'd imagine - at least the entries for the relevant table.

The .META. entries for the snapshotted table have already been dumped, haven't they? Why do
we still need a snapshot of a subset of .META.?

bq. So, do you foresee your restore-from-snapshot running split over the logs as part of the
restore? That makes sense to me.

Yes, restore-from-snapshot has to run a split over the WAL logs, which will take some time,
so restore-from-snapshot will not be very fast.

bq. Why you think we need a Reference to the hfile? Why not just a file that lists the names
of all the hfiles? We don't need to execute the snapshot, do we? Restoring from a snapshot
would be a bunch of file renames and wal splitting?

At first I thought the snapshot should keep the table directory structure for later use. For
example, a reader like HalfStoreFileReader could be provided so that we could read from the
snapshot directly. But yes, we don't actually execute the snapshot, so keeping a list of all
the hfiles (actually one list per RS, right?) should be enough. Also, restoring from a snapshot
is not just file renames: since an hfile might be referenced by several snapshots, we should
probably do a real copy when restoring, right?
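To make both points concrete, here is a minimal sketch with illustrative names (one plain
hfile list per RS, and a copy-based restore):

{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class SnapshotManifest {
  // (1) One manifest per RS: just the hfile paths, one per line.
  public static void writeHFileList(FileSystem fs, Path manifest,
      Iterable<Path> hfiles) throws IOException {
    FSDataOutputStream out = fs.create(manifest);
    try {
      for (Path p : hfiles) out.writeBytes(p.toString() + "\n");
    } finally {
      out.close();
    }
  }

  // (2) Restore copies each listed hfile into the live table dir.
  // A real copy, not a rename, because other snapshots may still
  // reference the same hfile.
  public static void restore(Configuration conf, FileSystem fs,
      Path manifest, Path tableDir) throws IOException {
    BufferedReader in =
        new BufferedReader(new InputStreamReader(fs.open(manifest)));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        Path src = new Path(line);
        FileUtil.copy(fs, src, fs, new Path(tableDir, src.getName()),
            false, conf);
      }
    } finally {
      in.close();
    }
  }
}
{code}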

bq. Shall we name the new .META. column family snapshot rather than reference?

Sure.

bq. On the filename '.deleted', I think it a mistake to give it a '.' prefix especially given
its in the snapshot dir...

Ok, I will rename the snapshot dir to '.snapshot'. For the dir '.deleted', what name do you
think we should use? Because there might be several snapshots under the '.snapshot' dir, each
with its own snapshot name, I named this dir '.deleted' to distinguish it from a snapshot name.
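For context, the layout I have in mind is roughly this (illustrative only):

{noformat}
<hbase.rootdir>/.snapshot/
    first_snapshot/    <- one subdir per snapshot, named by the user
    second_snapshot/
    .deleted/          <- hfiles dropped by the live table but still
                          referenced by some snapshot
{noformat}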

bq. Do you need a new catalog table called snapshots to keep list of snapshots, of what a
snapshot comprises and some other metadata such as when it was made, whether it succeeded,
who did it and why?

It would be much more convenient if a catalog table 'snapshot' could be created. Would this
impact the normal operation of hbase?
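For example, something along these lines with the admin api (a sketch only; the table and
family names are placeholders, not a proposed schema):

{code:java}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateSnapshotCatalog {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
    HTableDescriptor desc = new HTableDescriptor("snapshot");
    desc.addFamily(new HColumnDescriptor("info"));   // when, who, why, status
    desc.addFamily(new HColumnDescriptor("files"));  // what the snapshot comprises
    admin.createTable(desc);
  }
}
{code}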

bq. Section 7.4 is missing split of WAL files. Perhaps this can be done in a MR job? 

I'll add the split of the WAL logs. Yes, an MR job can be used. Which method do you think is
better: read from the imported files and insert into the table via the hbase api, or just copy
the hfiles into place and update .META.?
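To make the first option concrete (a sketch only; the KeyValue iterator is a stand-in for a
reader over whatever format the export produces):

{code:java}
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;

public class ImportByPuts {
  // Re-insert exported KeyValues through the client api. Simple and
  // safe, but every cell pays the full write path (WAL, memstore,
  // flush), so it is the slower of the two options.
  public static void importKVs(String tableName, Iterator<KeyValue> kvs)
      throws IOException {
    HTable table = new HTable(new HBaseConfiguration(), tableName);
    while (kvs.hasNext()) {
      KeyValue kv = kvs.next();
      Put put = new Put(kv.getRow());
      put.add(kv);  // preserves family, qualifier and timestamp
      table.put(put);
    }
    table.flushCommits();
  }
}
{code}

The other option, copying the hfiles under the region directories and inserting the
corresponding rows into .META., should be much faster but bypasses the write path and has to
get the region boundaries right.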

bq. Lets not have the master run the snapshot... let the client run it?
bq. Snapshot will be doing same thing whether table is partially online or not..

I put these two issues together because I think they are related. In the current design, if
a table is open, the snapshot is performed by each RS that serves the table's regions. Otherwise,
if the table is closed, the snapshot is performed by the master, because the table is not served
by any RS. The first comment is about a closed table, so the master performs the snapshot
because the client does not have access to the underlying dfs. For the second one, I was
thinking that if a table is partially online, its regions might be partially served by RSs and
partially offline, right? Then who performs the snapshot? If the RSs, the offline regions will
be missed; if the master, the online regions might lose data in the memstore. I'm confused.
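To spell the question out (pseudocode in Java form; every helper below is hypothetical, not
part of the design):

{code:java}
for (HRegionInfo region : regionsOf(tableName)) {
  if (isServedByRegionServer(region)) {
    // Online region: only the serving RS can include the memstore
    // contents, so it has to perform (or at least flush before)
    // the snapshot.
    askServingRSToSnapshot(region);
  } else {
    // Offline region: there is no memstore to lose, so the master
    // can snapshot the region's files directly from the dfs.
    masterSnapshotFromDfs(region);
  }
}
// A partially online table mixes both branches; the open question
// is who coordinates them.
{code}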

bq. It's a synchronous way. Do you think this is appropriate? Yes. I'm w/ JG on this.

This is another problem that confuses me. In the current design (which is synchronous), a
snapshot is started only when all the RSs are ready for the snapshot; then all the RSs perform
the snapshot concurrently. This guarantees the snapshot is not started if one RS fails. If we
switch to an asynchronous approach, should each RS start its snapshot immediately when it is
ready?
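For what it's worth, the synchronous scheme is essentially a barrier. Here is a plain-Java
illustration of the semantics (not HBase code): no RS starts until every RS has reported ready,
and a timeout aborts the snapshot before anyone has done work.

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class SnapshotBarrier {
  private final CountDownLatch ready;

  public SnapshotBarrier(int numRegionServers) {
    this.ready = new CountDownLatch(numRegionServers);
  }

  // each RS calls this once it is prepared to snapshot
  public void reportReady() {
    ready.countDown();
  }

  // the coordinator waits for all RSs before telling them to proceed;
  // on timeout the snapshot is aborted instead of started
  public boolean awaitAllReady(long timeoutMs) throws InterruptedException {
    return ready.await(timeoutMs, TimeUnit.MILLISECONDS);
  }
}
{code}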

> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Assignee: Li Chongxin
>            Priority: Minor
>         Attachments: HBase Snapshot Design Report V2.pdf, snapshot-src.zip
>
>
> Having an option to take a snapshot of a table would be very useful in production.
> What I would like to see this option do is a merge of all the data into one or more
> files stored in the same folder on the dfs. This way we could save data in case of a software
> bug in hadoop or user code.
> The other advantage would be the ability to export a table to multiple locations. Say I had
> a read-only table that must be online. I could take a snapshot of it when needed, export
> it to a separate data center, and have it loaded there; then I would have it online at
> multiple data centers for load balancing and failover.
> I understand that hadoop takes away the need for backups to protect against failed
> servers, but this does not protect us from software bugs that might delete or alter data
> in ways we did not plan. We should have a way to roll back a dataset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

