hadoop-general mailing list archives

From Aaron Eng <a...@maprtech.com>
Subject Re: Hadoop Java Versions
Date Thu, 30 Jun 2011 23:18:02 GMT
> Keeping the amount of disks per node low and the amount of nodes high
> should keep the impact of dead nodes in control.

It keeps the impact of dead nodes in control, but I don't think that's
long-term cost efficient.  As prices of 10GbE go down, the "keep the node
small" argument seems less fitting.  And on another note, most servers
manufactured in the last 10 years have dual 1GbE network interfaces.  If one
were to go by these calcs:

> 150 nodes with four 2TB disks each, with HDFS 60% full, it takes ~32
> minutes to recover

It seems like that assumes a single 1GbE interface, so why not leverage the
second?
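
For what it's worth, here's how I reconstruct that estimate (a rough sketch,
not Evert's actual spreadsheet; the ~100 MB/s effective per-node throughput
is my assumption, picked because it reproduces the quoted numbers):

# Back-of-envelope HDFS re-replication time, assuming the surviving
# nodes share the copy work evenly over single 1GbE links.
def recovery_seconds(total_nodes, failed_nodes, disks_per_node=4,
                     disk_tb=2.0, hdfs_full=0.60, node_bps=100e6):
    # Bytes of block data lost per dead node.
    lost_per_node = disks_per_node * disk_tb * 1e12 * hdfs_full
    lost = failed_nodes * lost_per_node
    survivors = total_nodes - failed_nodes
    # node_bps assumes ~100 MB/s effective per node on a 1GbE link.
    return lost / (survivors * node_bps)

print(recovery_seconds(150, 6) / 60.0)  # ~33 minutes (vs. the quoted ~32)
print(recovery_seconds(150, 2))         # ~650 seconds (vs. the quoted ~640)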
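
And using the second interface is mostly a config exercise on Linux. A
minimal RHEL-style bonding sketch (assuming an 802.3ad-capable ToR switch;
device names and addresses here are illustrative):

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
IPADDR=10.0.0.51
NETMASK=255.255.255.0
BONDING_OPTS="mode=802.3ad miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise ifcfg-eth1)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes

A single TCP stream still rides one physical link, but re-replication is many
parallel streams, so the aggregate throughput should roughly double.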

On Thu, Jun 30, 2011 at 2:31 PM, Evert Lammerts <Evert.Lammerts@sara.nl> wrote:

> > You can get 12-24 TB in a server today, which means the loss of a server
> > generates a lot of traffic -which argues for 10 Gbe.
> >
> > But
> >   -big increase in switch cost, especially if you (CoI warning) go with
> > Cisco
> >   -there have been problems with things like BIOS PXE and lights out
> > management on 10 Gbe -probably due to the NICs being things the BIOS
> > wasn't expecting and off the mainboard. This should improve.
> >   -I don't know how well linux works with ether that fast (field reports
> > useful)
> >   -the big threat is still ToR switch failure, as that will trigger a
> > re-replication of every block in the rack.
>
> Keeping the amount of disks per node low and the amount of nodes high
> should keep the impact of dead nodes in control. A ToR switch failing is
> different - missing 30 nodes (~120TB) at once cannot be fixed by adding more
> nodes; adding nodes actually increases the chance of a ToR switch failure,
> since more racks means more switches. Although such failure is quite rare to
> begin with, I guess. The back-of-the-envelope calculation I made suggests
> that ~150 (1U) nodes should be fine with 1Gb ethernet. (E.g., when 6 nodes
> fail in a cluster of 150 nodes with four 2TB disks each, with HDFS 60% full,
> it takes ~32 minutes to recover. 2 nodes failing should take around 640
> seconds. Also see the attached spreadsheet.) This doesn't take ToR switch
> failure into account though. On the other hand, 150 nodes is only ~5 racks;
> in such a scenario you might rather shut the system down completely than let
> it replicate 20% of all data.
>
> Cheers,
> Evert
