hadoop-common-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: Backing up HDFS
Date Tue, 03 Aug 2010 14:40:41 GMT


Here's a quick and dirty solution that works.
I'm assuming that your cloud is part of a larger corporate network, and that besides the
cloud itself you have 'cloud aware machines': machines that have hadoop installed and are not
part of your cloud, but are where you launch jobs and applications from. These machines also
have file system mounts to SANs or other network attached (or fiber channel attached) storage.

Step 1: Make a copy of the files that you want to back up in a separate directory on HDFS.
Step 2: From a 'cloud aware machine' that has SAN disk,
     use hadoop fs -copyToLocal <HDFS path> <local path>, where the local path is on the SAN.
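
The two steps above could be scripted roughly as follows. This is only a sketch: the paths (/data/critical, /backup/staging, /mnt/san/hdfs-backup) and the function name backup_hdfs are made up for illustration, and it assumes the hadoop command is on the PATH of the 'cloud aware machine'.

```shell
#!/bin/sh
# Hypothetical locations; replace them with your own.
SRC=${SRC:-/data/critical}                 # HDFS data you want to back up
STAGING=${STAGING:-/backup/staging}        # separate HDFS directory for the copy
SAN_DIR=${SAN_DIR:-/mnt/san/hdfs-backup}   # local mount point backed by the SAN
HADOOP=${HADOOP:-hadoop}                   # hadoop client command

backup_hdfs() {
    # Step 1: copy the files into a separate directory on HDFS.
    $HADOOP fs -cp "$SRC" "$STAGING" || return 1
    # Step 2: pull the staged copy down to the SAN-mounted local disk.
    $HADOOP fs -copyToLocal "$STAGING" "$SAN_DIR" || return 1
}
```

Call backup_hdfs from cron or by hand. Setting HADOOP=echo prints the two commands without running them, which is a cheap way to check the paths first.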

Now let your normal backup policy take over. (Assuming that you have a policy for backing
up data stored on the SAN)

I saw Eric's post about a second Cloud. Not always possible and not always a good idea if
all you want to do is to back up data sets for remote storage.

Note the following:
Performance will vary based on the number of data sets and sizes of the data sets you want
to store.



> Date: Tue, 3 Aug 2010 06:54:41 -0700
> From: dan.paulus@bronto.com
> To: core-user@hadoop.apache.org
> Subject: Backing up HDFS
> So I am administering a 10+ node hadoop cluster and everything is going
> swimmingly.  Unfortunately, some relatively critical data is now being
> stored on the cluster and I am being asked to create a backup solution for
> hadoop in case of catastrophic failure of the data center, the application
> creating data corruption, and ultimately my company wants that warm fuzzy
> feeling that only an offsite backup can provide.
> So does anyone else actually back up HDFS?  After a quick google and forum
> search I found the following link that creates a full backup and then
> incremental backups, anyone use this or something similar?
> http://blog.rapleaf.com/dev/2009/06/05/backing-up-hadoops-hdfs/
> Thanks in advance.
> -- 
> View this message in context: http://old.nabble.com/Backing-up-HDFS-tp29335698p29335698.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.