accumulo-user mailing list archives

From Keith Turner <>
Subject Re: Backup Strategies
Date Fri, 31 May 2013 19:29:26 GMT
On Fri, May 31, 2013 at 2:39 PM, Billie Rinaldi <> wrote:

> I'm not sure copying data out of HDFS is what you would want to do, though
> I suppose it depends on how much data you're storing there.  If you want a
> backup on a different system, but you have too much data to store outside
> of a distributed file system, you could consider using distcp to copy from
> one HDFS instance to another.
> You can't clone the !METADATA table.  In 1.5.0, you can export and import
> tables, which is designed to help you copy a table to a different cluster
> (see docs/examples/README.export).  Cloning your tables could help, but in
> the case of !METADATA corruption you're still in the position of manually
> creating a new table with the same configuration (and split points if you
> know them) and bulk importing the old data files.  I don't know if table
> export could be used to back up the metadata and configuration of a cloned
> table to help you recover its state later on the same system if the
> original table has gotten corrupted.  It's possible.

Export table saves the table's state (what's in !METADATA and ZooKeeper)
to a zip file.  So even if you do not actually copy the exported table, it
can be used to save table metadata.  I commented on ACCUMULO-942 about
using export table to obtain a consistent snapshot of HDFS and Accumulo
metadata.  That system metadata could then be backed up.
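
As a rough sketch of how the clone/offline/export sequence could be scripted,
the Python below only builds the Accumulo shell commands and the distcp
invocation; it does not run them.  The shell commands follow the pattern in
docs/examples/README.export for 1.5.0, and exact syntax may differ in other
versions.  The table name, the /backups directory, the "_bak_" clone naming,
and the helper names are all hypothetical, chosen for illustration.

```python
from datetime import datetime, timezone


def export_commands(table, export_root, stamp):
    """Build Accumulo shell commands that export a clone of `table`.

    Assumption: command syntax follows docs/examples/README.export (1.5.0).
    The "<table>_bak_<stamp>" clone name is a made-up convention.
    """
    clone = f"{table}_bak_{stamp}"
    export_dir = f"{export_root}/{clone}"
    return [
        f"clonetable {table} {clone}",           # cheap clone of the live table
        f"offline {clone}",                      # a table must be offline to export
        f"exporttable -t {clone} {export_dir}",  # writes metadata zip + file list
    ]


def distcp_argv(export_dir, dest_dir):
    """Build the hadoop distcp call that copies the files listed by the export.

    Assumption: the export directory contains distcp.txt, as described in
    README.export.
    """
    return ["hadoop", "distcp", "-f", f"{export_dir}/distcp.txt", dest_dir]


if __name__ == "__main__":
    # Hourly stamp in UTC, e.g. 2013053119
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H")
    for cmd in export_commands("mytable", "/backups", stamp):
        print(cmd)
    print(" ".join(distcp_argv(f"/backups/mytable_bak_{stamp}", "/backups_copy")))
```

The printed commands could be pasted into the Accumulo shell, or passed to it
one at a time.  If only the table metadata is wanted, the distcp step can be
skipped and just the exported zip file kept.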

> Billie
> On Fri, May 31, 2013 at 11:05 AM, Mike Hugo <> wrote:
>> I'm curious to know how people are backing up data in Accumulo.
>> We are planning on copying data out of HDFS on a regular basis to be
>> able to do full restore.
>> We've also ended up getting into a state of having a corrupt !METADATA
>> table a few times.  I'm wondering if doing a clone on a few tables on a
>> periodic basis (like every hour, for a few hours) might be one way to help
>> us recover from that situation.
>> E.g., if we did a clone on all tables, including the !METADATA table
>> hourly, and we didn't necessarily care about losing data in the last hour
>> time frame, could we simply restore from one of those clones if we get into
>> a corrupted state?
>> Is there another mechanism for snapshotting / backing up data in Accumulo?
>> Thanks for your thoughts!
>> Mike
