hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Backing up HDFS?
Date Wed, 11 Feb 2009 10:44:08 GMT
Allen Wittenauer wrote:
> On 2/9/09 4:41 PM, "Amandeep Khurana" <amansk@gmail.com> wrote:
>> Why would you want to have another backup beyond HDFS? HDFS itself
>> replicates your data so if the reliability of the system shouldnt be a
>> concern (if at all it is)...
> 
> I'm reminded of a previous job where a site administrator refused to make
> tape backups (despite our continual harassment and pointing out that he was
> in violation of the contract) because he said RAID was "good enough".
> 
> Then the RAID controller failed. When we couldn't recover data "from the
> other mirror" he was fired.  Not sure how they ever recovered, esp.
> considering what the data was they lost.  Hopefully they had a paper trail.

Hope that wasn't at SUNW, given that they make their own controllers.

1. controller failure is lethal, especially if you don't notice for a while
2. some products (say, databases) didn't like live updates, so a trick
evolved of splitting off part of the RAID array and putting that to tape.
Of course, then there's the problem of what happens then
3. Tape is still very power efficient; good for a bulk off-site store 
(or a local fire-safe)
4. Over at last.fm, they had an accidental rm / on their primary dataset.
Fortunately they apparently did have another copy somewhere else. And
now that HDFS has user IDs, you can prevent anyone but the admin team
from accidentally deleting everyone's data.
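
To make point 4 a bit more concrete, here is a rough Python sketch of the kind of POSIX-style check that lets only an admin account delete entries under a shared directory. It is a toy model, not HDFS code: the `Dir` class, the `can_delete_child` function, and the account and group names (`hdfs`, `admins`, `alice`) are all made up for illustration, and it assumes deletion is governed by the write bit on the parent directory, with a superuser that bypasses the check.

```python
from dataclasses import dataclass


@dataclass
class Dir:
    owner: str
    group: str
    mode: int  # POSIX-style permission bits, e.g. 0o755


def can_delete_child(user, groups, parent, superuser="hdfs"):
    """Return True if `user` may delete an entry inside `parent`.

    Assumption: the superuser name is hypothetical, and the rule modelled
    here is simply "write permission on the parent directory".
    """
    if user == superuser:
        return True  # the superuser bypasses permission checks
    if user == parent.owner:
        bits = (parent.mode >> 6) & 0o7  # owner bits
    elif parent.group in groups:
        bits = (parent.mode >> 3) & 0o7  # group bits
    else:
        bits = parent.mode & 0o7         # "other" bits
    return bool(bits & 0o2)              # write bit set?


# Hypothetical layout: the shared dataset directory belongs to the admin account
# and is read-only for everyone else.
dataset = Dir(owner="hdfs", group="admins", mode=0o755)

print(can_delete_child("alice", {"users"}, dataset))  # False
print(can_delete_child("hdfs", set(), dataset))       # True
```

With the directory at mode 755 and owned by the admin account, an ordinary user's stray delete is refused, while the admin team can still clean up. It does nothing for an admin's own typo, which is why the second copy still matters.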
