hadoop-general mailing list archives

From Ted Dunning <tdunn...@maprtech.com>
Subject Re: Hadoop Java Versions
Date Fri, 01 Jul 2011 01:12:41 GMT
Good point Todd.

I was speaking from the experience of people I know who are using 0.20.x.

On Thu, Jun 30, 2011 at 5:24 PM, Todd Lipcon <todd@cloudera.com> wrote:

> On Thu, Jun 30, 2011 at 5:16 PM, Ted Dunning <tdunning@maprtech.com> wrote:
>
> > You have to consider the long-term reliability as well.
> >
> > Losing an entire set of 10 or 12 disks at once makes the overall
> > reliability of a large cluster very suspect.  This is because it becomes
> > entirely too likely that two additional drives will fail before the data
> > on the off-line node can be replicated.  For 100 nodes, that can decrease
> > the average time to data loss down to less than a year.  This can only be
> > mitigated in stock hadoop by keeping the number of drives relatively low.
> > MapR avoids this by not failing nodes for trivial problems.
> >
>
> I'd advise you to look at "stock hadoop" again. This used to be true, but
> was fixed a long while back by HDFS-457 and several followup JIRAs.
>
> If MapR does something fancier, I'm sure we'd be interested to hear about
> it so we can compare the approaches.
>

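For anyone who wants to experiment with the reliability argument quoted
above, here is a rough back-of-the-envelope sketch in Python. It is not
taken from the thread and is not how HDFS or MapR compute anything. The
function names, the drive failure rate, and the re-replication time per
drive are all hypothetical inputs. The model assumes independent drive
failures (Poisson), 3-way replication, and, pessimistically, that any two
further drive failures during the re-replication window lose data. The
flag only switches between "one bad drive takes the whole node offline"
and "only the failed drive goes offline".

```python
import math

HOURS_PER_YEAR = 8760.0


def p_at_least_two(mean_failures):
    """Poisson probability of two or more events, given the expected count."""
    return 1.0 - math.exp(-mean_failures) * (1.0 + mean_failures)


def mttdl_years(nodes, drives_per_node, drive_afr, hours_per_drive,
                whole_node_on_drive_failure=True):
    """Rough mean time to data loss, in years, under the stated assumptions.

    drive_afr        -- assumed annual failure rate of one drive (e.g. 0.05)
    hours_per_drive  -- assumed hours to re-replicate one drive's worth of data
    """
    total_drives = nodes * drives_per_node

    # Every drive failure starts a re-replication window.
    events_per_year = total_drives * drive_afr

    # How many drives go offline together when one drive fails.
    offline = drives_per_node if whole_node_on_drive_failure else 1

    # Assumption: the window grows linearly with the amount of data offline.
    window_hours = hours_per_drive * offline

    # Expected number of further drive failures during the window.
    surviving = total_drives - offline
    mean_failures = surviving * drive_afr * window_hours / HOURS_PER_YEAR

    # Pessimistic: any two further failures in the window count as data loss.
    loss_events_per_year = events_per_year * p_at_least_two(mean_failures)
    return 1.0 / loss_events_per_year


if __name__ == "__main__":
    for whole_node in (True, False):
        years = mttdl_years(100, 12, 0.05, 2.0, whole_node)
        print("whole node offline:", whole_node, "MTTDL (years):", round(years, 1))
```

The absolute numbers depend entirely on the assumed inputs, so treat the
output as a way to compare the two behaviours rather than as a prediction
for any real cluster.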