accumulo-user mailing list archives

From <mikewest...@zapatatechnology.com>
Subject RE: Backup and Recovery
Date Tue, 03 Oct 2017 22:24:55 GMT
What’s the name of the utility?

 

From: Christopher [mailto:ctubbsii@apache.org] 
Sent: Tuesday, October 3, 2017 2:01 PM
To: user@accumulo.apache.org
Subject: Re: Backup and Recovery

 

Oh, sorry, no. That's not the case. I did not mean to mislead. You also need to back up the
metadata from ZooKeeper for a complete backup. We have a utility for that, which I believe
is mentioned in the documentation. If not, that's a documentation bug and we should add it.
(Sorry, unable to check at the moment, but please file a bug if you can't find it.)

On Tue, Oct 3, 2017 at 4:47 PM <mikewestman@zapatatechnology.com> wrote:

So if I back up HDFS, I have a backup of Accumulo? There isn’t any other data that I’d
need to grab?

 

From: Christopher [mailto:ctubbsii@apache.org]
Sent: Tuesday, October 3, 2017 1:41 PM
To: user@accumulo.apache.org
Subject: Re: Backup and Recovery

 

Hi Mike. This is a great question. Accumulo has several options for backup.

Accumulo is backed by HDFS for persisting its data on disk. It may be possible to use S3 directly
at this layer. I'm not sure what the current state is for doing something like this, but a
brief Googling for "HDFS on S3" shows a few historical projects which may still be active
and mature.

Accumulo also has a replication feature to automatically mirror live ingest to a pluggable
external receiver, which could be a backup service you've written to store data in S3. Recovery
would depend on how you store the data in S3. You could also implement an ingest system which
stores data to a backup as well as to Accumulo, to handle both live and bulk ingest.
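
As a rough illustration of the dual-write ingest idea, here is a minimal Python sketch. Everything in it is hypothetical: `Sink`, `InMemorySink`, and `dual_write` are made-up names, and the in-memory sinks only stand in for a real Accumulo writer and a real S3 uploader. It shows one possible ordering (backup first, then Accumulo), not a recommended design.

```python
from typing import Iterable, List, Protocol, Tuple

# Simplified stand-in for a mutation: (row key, value). Hypothetical type.
Record = Tuple[str, str]


class Sink(Protocol):
    """Anything that can accept a batch of records (hypothetical interface)."""

    def write(self, records: List[Record]) -> None: ...


class InMemorySink:
    """Illustration-only sink; a real one would wrap an Accumulo writer or an S3 client."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def write(self, records: List[Record]) -> None:
        self.records.extend(records)


def dual_write(batch: Iterable[Record], backup: Sink, primary: Sink) -> int:
    """Write a batch to the backup first, then to the primary store.

    If the backup write raises, the primary write is never attempted, so the
    primary store does not receive this batch without the backup having it.
    """
    records = list(batch)
    backup.write(records)
    primary.write(records)
    return len(records)
```

Writing to the backup first is only one choice; it trades ingest latency for the guarantee that a failed backup write stops the batch. Retries and partial-failure handling are left out.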

Accumulo also has an "exporttable" feature, which exports the metadata for a table, along
with a list of files in HDFS for you to back up to S3 (or another file system). Recovery involves
using the "importtable" feature which recreates the metadata, and bulk importing the files
after you've moved them from your backup location back onto HDFS.
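
To make the export-based approach a little more concrete, here is a small Python sketch that turns the exported file list into a copy plan for a backup location. It assumes the list is plain text with one source path per line and that files are copied flat into a single destination directory; `plan_backup_copies` and the example paths are hypothetical, and the actual copy (for example with a Hadoop copy tool) is not shown.

```python
from posixpath import basename
from typing import Dict, Iterable


def plan_backup_copies(file_list_lines: Iterable[str], dest_prefix: str) -> Dict[str, str]:
    """Map each listed source path to a destination path under dest_prefix.

    Blank lines are skipped. Files are placed flat under dest_prefix, so two
    sources with the same file name would collide; that case raises ValueError.
    """
    prefix = dest_prefix.rstrip("/")
    plan: Dict[str, str] = {}
    used_destinations = set()
    for line in file_list_lines:
        src = line.strip()
        if not src:
            continue
        dest = f"{prefix}/{basename(src)}"
        if dest in used_destinations:
            raise ValueError(f"duplicate file name in export list: {basename(src)}")
        used_destinations.add(dest)
        plan[src] = dest
    return plan
```

For example, calling `plan_backup_copies(open("file_list.txt"), "s3a://my-bucket/table1-backup/")` (both names made up) would return a dictionary of source-to-destination paths that a copy job could then work through.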

This is just a rough outline of 3 possible solutions. I don't know which (if any) would match
your requirements best. There may be many other solutions as well.

On Tue, Oct 3, 2017 at 4:10 PM <mikewestman@zapatatechnology.com> wrote:

Please forgive the newbie question. What options are there for backup and recovery of Accumulo
data?

 

Ideally I would like something that would replicate to S3 in real time.

 


