hadoop-general mailing list archives

From Luke Lu <...@vicaya.com>
Subject Re: Dedicated disk for operating system
Date Wed, 10 Aug 2011 19:19:18 GMT
On Wed, Aug 10, 2011 at 10:40 AM, Ted Dunning <tdunning@maprtech.com> wrote:
> To be specific, taking a 100 node x 10 disk x 2 TB configuration with drive
> MTBF of 1000 days, we should be seeing drive failures on average once per
> day....
> For a 10,000 node cluster, however, we should expect an average disk
> failure rate of one failure every 2.5 hours.

Do you have real data to back up the analysis? You assume a uniform disk
failure distribution, which is absolutely not true. I can only say
that our ops data across 40000+ nodes shows that the above analysis is
not even close. (This assumes that the ops team knows what they are
doing, though :)
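
For reference, the back-of-the-envelope estimate in the quoted message can be reproduced with a few lines of Python. This is only a sketch of the naive model under dispute: it assumes every disk fails independently at a constant rate of 1/MTBF, which is exactly the uniformity assumption questioned above. The function names are hypothetical, chosen for illustration.

```python
HOURS_PER_DAY = 24.0


def expected_failures_per_day(nodes, disks_per_node, mtbf_days):
    """Mean disk failures per day across the cluster.

    Naive model: each disk fails independently at a constant
    rate of 1 / mtbf_days, so the rates simply add up.
    """
    total_disks = nodes * disks_per_node
    return total_disks / mtbf_days


def hours_between_failures(nodes, disks_per_node, mtbf_days):
    """Mean time between disk failures, in hours, under the same model."""
    return HOURS_PER_DAY / expected_failures_per_day(
        nodes, disks_per_node, mtbf_days
    )


if __name__ == "__main__":
    # 10 disks per node and a 1000-day MTBF, as in the quoted example.
    for nodes in (100, 1_000, 10_000):
        per_day = expected_failures_per_day(nodes, 10, 1000)
        gap = hours_between_failures(nodes, 10, 1000)
        print(f"{nodes:>6} nodes: {per_day:6.1f} failures/day, "
              f"one every {gap:.2f} hours")
```

With 10 disks per node and a 1000-day MTBF, this model gives one failure per day for 100 nodes. It gives one failure every 2.4 hours for 1,000 nodes and roughly one every 14 minutes for 10,000 nodes, so the quoted 2.5-hour figure may rest on assumptions elided in the "....". None of this says anything about how real disks fail; it only makes the assumed model explicit.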

