hadoop-common-dev mailing list archives

From Allen Wittenauer ...@apache.org>
Subject Re: Hadoop Master and Slave Discovery
Date Thu, 07 Jul 2011 00:49:37 GMT

On Jul 6, 2011, at 5:05 PM, Eric Yang wrote:

> Did you know that almost all linux desktop system comes with avahi
> pre-installed and turn on by default?

	... which is why most admins turn those services off by default. :)

>  What is more interesting is
> that there are thousands of those machines broadcast in large
> cooperation without anyone noticing them? 

	That's because many network teams turn off multicast past the subnet boundary, and many
corporate desktops are in class C (/24) subnets.  This automatically limits the host count
to 200-ish per network.  Usually just the unicast traffic is bad enough.  Throwing multicast
into the mix just makes it worse.

> I have recently built a
> multicast dns browser and look into the number of machines running in
> a large company environment.  The number of desktop, laptop and
> printer machines running multicast dns is far exceeding 1000 machines
> in the local subnet.

	From my understanding of Y!'s network, the few /22s they have (which would get you 1022
potential hosts on a subnet) have multicast traffic dropped at the router and switch levels.
Additionally, DNS-SD (the service discovery layer usually paired with mDNS) also works over
unicast DNS.  So there is a very good chance that the traffic you are seeing is unicast,
not multicast.
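
	The host counts quoted above (200-ish for a class C, 1022 for a /22) can be checked with a short sketch using Python's standard `ipaddress` module. The network addresses below are illustrative placeholders, not addresses from any real network, and the helper name `usable_hosts` is made up for this example. It assumes an ordinary subnet where the network and broadcast addresses are not assignable (so it is not meant for /31 or /32).

```python
import ipaddress


def usable_hosts(cidr: str) -> int:
    """Return the number of assignable host addresses in an IPv4 subnet.

    Assumes an ordinary subnet (prefix length 30 or shorter), where the
    network address and the broadcast address are not assignable.
    """
    net = ipaddress.ip_network(cidr)
    return net.num_addresses - 2


# Illustrative private-range networks, one /24 and one /22.
for cidr in ("192.168.0.0/24", "10.0.0.0/22"):
    print(cidr, usable_hosts(cidr))
```

	A /24 yields 254 usable addresses and a /22 yields 1022, so a browser showing well over 1000 hosts would need every address on a /22 (or something larger) to be answering.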

	The 1000 number, BTW, comes from Apple.  I'm sure they'd be interested in your findings given
their role in Zeroconf (ZC).

	BTW, I'd much rather hear that you set up a /22 with many many machines running VMs trying
to actually use mDNS for something useful.  A service browser really isn't that interesting.

> They are all happily working fine without causing any issues.

	... that you know of.  Again, I'm 99% certain that Y! is dropping multicast packets into
the bit bucket at the switch boundaries.  [I remember having this conversation with them when
we set up the new data centers.]

>  Printer works fine,

	Most admins turn off SLP and other broadcast services on printers.  For large networks,
one usually sees print services enabled via AD or master print servers broadcasting the information
on the local subnet.  This allows a central point of control rather than randomness.  Snow
Leopard (I don't think Leopard did this) actually tells you where the printer is coming from
now, so that's handy to see if they are ZC or AD or whatever.

> itune sharing from someone
> else works fine.

	iTunes specifically limits its reach so that it can't extend beyond the local subnet and
definitely does unicast in addition to ZC, so that doesn't really say much of anything, other
than potentially invalidating your results.

>  For some reason, things tend to work better on my
> side of universe. :)  

	I'm sure they do, but not for the reasons you think.

> Allen, if you want to get stuck on stone age
> tools, I won't stop you.

	Multicast has a time and place (mainly for small, non-busy networks).  Using it without
understanding the network impact is never a good idea.

	FWIW, I've seen multicast traffic bring down an entire campus of tens of thousands of machines
due to routers and switches having bugs where they didn't decrement the packet's TTL.
I'm not the only one with these types of experiences.  Anything multicast is going to have
a very large uphill battle for adoption because of these widespread problems.  Many network
vendors really don't get this one right, for some reason.
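
	To illustrate why a missed TTL decrement matters, here is a toy simulation, not a model of any real router or of the incident described above. It assumes a packet is forwarded around a ring of devices, each of which either decrements the TTL correctly or not. The function name `hops_until_dropped` and the `max_hops` cutoff are hypothetical and exist only for this sketch.

```python
from typing import Optional, Sequence


def hops_until_dropped(ttl: int, ring: Sequence[bool],
                       max_hops: int = 10_000) -> Optional[int]:
    """Forward a packet around a ring of devices until its TTL expires.

    ring[i] is True if device i decrements the TTL correctly and False if
    it forwards the packet with the TTL untouched (the bug).  Returns the
    number of hops taken when the packet is dropped, or None if it is
    still circulating after max_hops.
    """
    hops = 0
    while hops < max_hops:
        decrements = ring[hops % len(ring)]
        if decrements:
            ttl -= 1
            if ttl <= 0:
                return hops + 1  # dropped by this device
        hops += 1
    return None  # never expired within the cutoff


print(hops_until_dropped(4, [True, True, True]))     # all devices correct
print(hops_until_dropped(4, [True, False, True]))    # one buggy device
print(hops_until_dropped(4, [False, False, False]))  # every device buggy
```

	With every device behaving, the packet dies after as many hops as its initial TTL. One buggy device only delays the drop, but a forwarding loop made up entirely of buggy devices never drops the packet at all, which is the kind of situation where traffic keeps circulating and piling up.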