hadoop-general mailing list archives

From Evert Lammerts <Evert.Lamme...@sara.nl>
Subject RE: Hadoop Java Versions
Date Thu, 30 Jun 2011 21:31:26 GMT
> You can get 12-24 TB in a server today, which means the loss of a server
> generates a lot of traffic -which argues for 10 GbE.
> 
> But
>   -big increase in switch cost, especially if you (CoI warning) go with
> Cisco
>   -there have been problems with things like BIOS PXE and lights-out
> management on 10 GbE -probably due to the NICs being things the BIOS
> wasn't expecting and off the mainboard. This should improve.
>   -I don't know how well Linux works with Ethernet that fast (field reports
> useful)
>   -the big threat is still ToR switch failure, as that will trigger a
> re-replication of every block in the rack.

Keeping the number of disks per node low and the number of nodes high should keep the impact
of dead nodes under control. A failing ToR switch is different: missing 30 nodes (~120 TB) at
once cannot be fixed by adding more nodes, since more nodes means more racks, and with them
more ToR switches that can fail. Although such a failure is quite rare to begin with, I guess.
The back-of-the-envelope calculation I made suggests that ~150 (1U) nodes should be fine on
1 Gb Ethernet. (E.g., when 6 nodes fail in a 150-node cluster with four 2 TB disks per node
and HDFS 60% full, recovery takes around 32 minutes; 2 failing nodes should take around 640
seconds. Also see the attached spreadsheet and the sketch below.) This doesn't take ToR switch
failure into account, though. On the other hand, 150 nodes is only ~5 racks; in such a scenario
you might rather shut the system down completely than let it re-replicate 20% of all data.
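
For what it's worth, here's a minimal sketch of the arithmetic behind those numbers (not the
attached spreadsheet itself; the ~100 MB/s effective per-node rate on 1 Gb Ethernet is my
assumption, as is re-replication work spreading evenly over the surviving nodes):

public class RecoveryEstimate {
    public static void main(String[] args) {
        int totalNodes = 150;
        int failedNodes = 6;               // set to 2 for the second scenario
        double disksPerNode = 4;
        double diskSizeTB = 2.0;
        double hdfsFull = 0.6;             // HDFS 60% full
        double nodeThroughputMBps = 100.0; // assumed effective rate on 1 GbE

        // Data to re-replicate, in MB (1 TB = 1e6 MB)
        double lostDataMB = failedNodes * disksPerNode * diskSizeTB * hdfsFull * 1e6;

        // Surviving nodes share the re-replication work evenly
        int survivors = totalNodes - failedNodes;
        double perNodeMB = lostDataMB / survivors;
        double seconds = perNodeMB / nodeThroughputMBps;

        System.out.printf("%d failed nodes: ~%.0f s (~%.0f min)%n",
                failedNodes, seconds, seconds / 60.0);
    }
}

With those inputs it gives ~2000 s (~33 min) for 6 failed nodes and ~650 s for 2, in line
with the figures above.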

Cheers,
Evert