hbase-issues mailing list archives

From "Li Chongxin (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-50) Snapshot of table
Date Sun, 13 Jun 2010 02:13:18 GMT

     [ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Chongxin updated HBASE-50:

    Attachment: HBase Snapshot Design Report V3.pdf

The design document has been updated based on the discussion. The following changes have been made:

* Requirements have been updated

* A snapshot can now be created for both online (enabled) and offline (disabled) tables.
For an offline table, the snapshot is performed by the master.

* Metadata for the table is no longer copied from .regioninfo but dumped entirely from .META.

* WAL logs are now archived instead of deleted, so a snapshot no longer copies the log files
but takes a file that lists the log names. A new section 6.5 on log maintenance has been added.

* Rename 'reference' family in .META. to 'snapshot'

* Add the same column family 'snapshot' to -ROOT- so that a snapshot of .META. can be taken too

* A new file .snapshotinfo is created under each snapshot dir to keep the meta information
of the snapshot. The list operation for snapshots will read this meta file.

* A new operation 'Restore' is added to restore a table from a snapshot in the same data center

* Export and import are changed. They are now used to export a snapshot to, or import
a snapshot from, other data centers. An exported snapshot therefore has the same file format
as an exported table, so that we can treat an exported snapshot the same as an exported table
and import it with the same import facility.
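
The .snapshotinfo and log-list items above could be sketched roughly as follows. This is an illustration only, not code from the design report: it uses the local filesystem in place of HDFS, and the properties-style layout of .snapshotinfo, the key names, the log-list file name ".logs", and the class and method names are all assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

/** Illustrative only: a local-filesystem stand-in for snapshot directories on HDFS. */
final class SnapshotDirSketch {

    // Only ".snapshotinfo" is named in the message; ".logs" is a hypothetical name.
    static final String INFO_FILE = ".snapshotinfo";
    static final String LOG_LIST_FILE = ".logs";

    /** Creates a snapshot dir holding meta information and a list of WAL names; no log data is copied. */
    static Path createSnapshot(Path root, String snapshotName, String tableName,
                               List<String> walNames) throws IOException {
        Path dir = Files.createDirectories(root.resolve(snapshotName));

        // Assumed format: simple key=value properties.
        Properties info = new Properties();
        info.setProperty("name", snapshotName);
        info.setProperty("table", tableName);
        info.setProperty("created", Long.toString(System.currentTimeMillis()));
        try (OutputStream out = Files.newOutputStream(dir.resolve(INFO_FILE))) {
            info.store(out, null);
        }

        // One archived log name per line, instead of copies of the log files.
        Files.write(dir.resolve(LOG_LIST_FILE), walNames, StandardCharsets.UTF_8);
        return dir;
    }

    /** Lists snapshots by reading each directory's meta file. */
    static List<String> listSnapshots(Path root) throws IOException {
        List<String> result = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(root)) {
            for (Path dir : dirs) {
                Path infoFile = dir.resolve(INFO_FILE);
                if (!Files.isRegularFile(infoFile)) {
                    continue; // not a snapshot directory
                }
                Properties info = new Properties();
                try (InputStream in = Files.newInputStream(infoFile)) {
                    info.load(in);
                }
                result.add(info.getProperty("name") + " (table " + info.getProperty("table") + ")");
            }
        }
        Collections.sort(result); // directory iteration order is unspecified
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("snapshots");
        createSnapshot(root, "snap1", "table1", Arrays.asList("hlog.0001", "hlog.0002"));
        for (String line : listSnapshots(root)) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is that listing only touches the small meta file in each snapshot dir, and that a snapshot records log names rather than log contents, which relies on the logs being archived rather than deleted.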

Pending Questions:
What if a table with the same name is still online when we want to restore a snapshot? There
will be a name collision in both HDFS and .META. We should not touch the existing table.
Should we then allow the snapshot to be restored under a new table name? For example, if the snapshot was created
for table "table1", can we restore it as "table2"?
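
One possible answer to the question above, sketched under assumptions: restore takes an optional target name and refuses to proceed on a collision, so the existing table is never touched. The class, the method signature, and the in-memory set standing in for the tables known to .META. and HDFS are all hypothetical; nothing here reflects a decision recorded in this message.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: name-collision handling for a hypothetical restore operation. */
final class RestoreNameSketch {

    // Stand-in for the table names present in .META. and HDFS.
    private final Set<String> existingTables = new HashSet<>();

    RestoreNameSketch(Set<String> tables) {
        existingTables.addAll(tables);
    }

    /**
     * Restores a snapshot of snapshotTable under targetTable, or under the
     * original name when targetTable is null. Throws on a name collision
     * instead of modifying the existing table.
     */
    String restore(String snapshotTable, String targetTable) {
        String name = (targetTable == null) ? snapshotTable : targetTable;
        if (existingTables.contains(name)) {
            throw new IllegalStateException(
                "Table '" + name + "' already exists; restore under a different name");
        }
        existingTables.add(name); // the actual file and .META. work is omitted
        return name;
    }

    public static void main(String[] args) {
        Set<String> tables = new HashSet<>();
        tables.add("table1");
        RestoreNameSketch sketch = new RestoreNameSketch(tables);
        System.out.println(sketch.restore("table1", "table2"));
    }
}
```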

> Snapshot of table
> -----------------
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Assignee: Li Chongxin
>            Priority: Minor
>         Attachments: HBase Snapshot Design Report V2.pdf, HBase Snapshot Design Report V3.pdf, snapshot-src.zip
> Having an option to take a snapshot of a table would be very useful in production.
> What I would like to see this option do is merge all the data into one or more
> files stored in the same folder on the DFS. This way we could save data in case of a software
> bug in Hadoop or user code.
> The other advantage would be the ability to export a table to multiple locations. Say I had
> a read-only table that must be online. I could take a snapshot of it when needed and export
> it to a separate data center and have it loaded there, and then I would have it online at multiple
> data centers for load balancing and failover.
> I understand that Hadoop removes the need for backups to protect against failed
> servers, but this does not protect us from software bugs that might delete or alter data
> in ways we did not plan. We should have a way to roll back a dataset.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
