hadoop-common-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: Backing up HDFS
Date Tue, 03 Aug 2010 15:46:16 GMT

> Date: Tue, 3 Aug 2010 11:02:48 -0400
> Subject: Re: Backing up HDFS
> From: edlinuxguru@gmail.com
> To: common-user@hadoop.apache.org

> Assuming you are taking the distcp approach, you can mirror your
> cluster with some scripting/coding. However, your destination systems
> can be more modest, assuming you wish to use them ONLY for data, with
> no job processing:

And that would be a waste. (Why build a cloud just to store data and not do any processing?)

You're not building your cloud in a vacuum. There are going to be SANs, other servers, and
possibly tape available. The trick is getting the important data off the cloud to a place where
it can be backed up via the corporation's standard IT practices.

Because of the size of the data, you may see people pulling data off the cloud into a SAN, then
to either a tape drive or a hot-swap SATA drive for off-site storage.
It all depends on the value of the data.
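
As a rough sketch of the "pull it off the cloud into a SAN" step, assuming the SAN is mounted
on a node that has the hadoop client installed: the script below builds one
`hadoop fs -copyToLocal` command per HDFS path and mirrors the HDFS layout under a staging
directory. The paths (`/data/important`, `/mnt/san/hdfs_export`) and the function names are
made up for illustration, and it only prints the commands unless you turn dry_run off.

```python
import os
import subprocess
from pathlib import PurePosixPath


def build_export_commands(hdfs_paths, staging_root):
    """Return one `hadoop fs -copyToLocal` command per HDFS path.

    staging_root is a hypothetical local mount point (e.g. a SAN volume).
    """
    commands = []
    for hdfs_path in hdfs_paths:
        # Mirror the HDFS directory layout under the staging root.
        relative = PurePosixPath(hdfs_path).relative_to("/")
        local_target = PurePosixPath(staging_root) / relative
        commands.append(
            ["hadoop", "fs", "-copyToLocal", hdfs_path, str(local_target)]
        )
    return commands


def run_export(commands, dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
            continue
        # Make sure the parent of the local target exists before copying.
        os.makedirs(os.path.dirname(cmd[-1]), exist_ok=True)
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical list of the paths that are actually worth backing up.
    important = ["/data/important", "/user/reports/2010"]
    run_export(build_export_commands(important, "/mnt/san/hdfs_export"))
```

From the staging directory, the normal IT backup tooling (tape, off-site drives) can take over.
Which paths go on the list is the "value of the data" decision.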

Again, YMMV


