hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: HDFS replica management
Date Tue, 17 Jul 2007 15:41:00 GMT


Assuming that you have many more than 3 disks, the chance that 3
simultaneous disk failures hit exactly the 3 disks holding your data is much
lower than the chance of losing any 3 disks.  Hadoop improves on this by
placing replicas in different racks, since one of the few mechanisms that
produces correlated failures is losing an entire rack.

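For example, if you have 20 disks, then the chance of losing a particular
three disks, given that you are losing 3 disks, is about one in a thousand
(1 in 1140, the number of ways to choose 3 disks out of 20, assuming every
disk is equally likely to fail).  Losing all three should be impossible if
the failures are rack aligned, provided the replicas are spread across more
than one rack.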
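
The arithmetic behind that estimate can be checked with a short sketch. This
is only an illustration under the same assumption as above (every set of
failed disks is equally likely); the function name and parameters are
made up for this example and are not part of Hadoop.

```python
from math import comb


def chance_all_replicas_lost(total_disks: int, failed_disks: int, replicas: int) -> float:
    """Probability that one particular set of `replicas` disks is entirely
    contained in the set of failed disks, given that exactly `failed_disks`
    out of `total_disks` have failed and every such set is equally likely.

    Hypothetical helper for illustration; not a Hadoop API.
    """
    if failed_disks < replicas:
        # Fewer failures than replicas: at least one copy must survive.
        return 0.0
    # Count the failure sets that include all replica disks, then divide
    # by the total number of possible failure sets.
    favourable = comb(total_disks - replicas, failed_disks - replicas)
    return favourable / comb(total_disks, failed_disks)


if __name__ == "__main__":
    p = chance_all_replicas_lost(total_disks=20, failed_disks=3, replicas=3)
    print(f"20 disks, 3 failures, 3 replicas: 1 in {round(1 / p)}")
    # With a fourth replica, three failures can no longer remove every copy.
    print(chance_all_replicas_lost(total_disks=20, failed_disks=3, replicas=4))
```

Under these assumptions the first line reports 1 in 1140, which is the
"about one in a thousand" figure, and the second call returns 0.0.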

Remember, you can always increase the number of replicas if you like.


On 7/17/07 12:55 AM, "Phantom" <ghostwhoowalks@gmail.com> wrote:

> Is replica management built into HDFS ? What I mean is if I set replication
> factor to 3 and if I lose 3 disks is that data lost forever ? I mean all 3
> disks dying at the same time I know is a far fetched scenario but if they
> die over a certain period of time does HDFS re-replicate the data to ensure
> that there are always 3 copies in the system ?
> 
> Thanks
> A

