hadoop-common-dev mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Hadoop Master and Slave Discovery
Date Tue, 05 Jul 2011 09:40:59 GMT
On 04/07/11 18:22, Ted Dunning wrote:
> One reasonable suggestion that I have heard recently was to do like Google
> does and put a DNS front end onto Zookeeper.  Machines would need to have
> DNS set up properly and requests for a special ZK based domain would have
> to be delegated to the fancy DNS setup, but this would allow all kinds of
> host targeted configuration settings to be moved by a very standardized
> network protocol.  There are difficulties moving port numbers and all you
> really get are hostnames, but it is a nice trick since configuration of
> which nameserver to use is a common admin task.

good point

1. You could use DNS proper, by way of Bonjour/avahi. You don't need to
be running any mDNS server to support .local, and I would strongly
advise against it in a large cluster (because .local resolution puts a
lot of CPU load on every server in the subnet). What you can do is have
the DNS server register some .local entries and have the clients use
these to bind. You probably also need to set the DNS TTLs in the JVM. In
a large cluster that'll just add to the DNS traffic, so that's where host
tables start to look appealing.

2. Apache MINA is set up to serve its directory data in lots of ways,
including what appears to be text files over NFS. This is an even nicer
trick. If you could get MINA to serve up the ZK data, life would be very
simple.
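
On the JVM DNS TTLs mentioned in point 1: a minimal sketch of how a client could cap the JVM's address cache so that re-registered names are picked up. The security properties networkaddress.cache.ttl and networkaddress.cache.negative.ttl are standard Java ones; the class name DnsTtlConfig and the 30 s / 5 s values are only illustrative assumptions, and the right numbers depend on how much extra DNS traffic the cluster can tolerate.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;

// Hypothetical helper, not part of Hadoop: caps the JVM-wide DNS caches.
public class DnsTtlConfig {

    /** Call before the first name lookup in the process; later calls may be ignored. */
    public static void apply(int positiveTtlSeconds, int negativeTtlSeconds) {
        // How long successful lookups stay cached.
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        // How long failed lookups stay cached.
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) throws UnknownHostException {
        apply(30, 5); // illustrative values only
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + InetAddress.getByName(host).getHostAddress());
    }
}
```

The same properties can also be set in the JRE's java.security file, which avoids touching application code.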

> On Mon, Jul 4, 2011 at 3:44 AM, Steve Loughran <stevel@apache.org> wrote:
>> On 03/07/11 03:11, Raja Nagendra Kumar wrote:
>>> Hi,
>>> Instead of depending on local syncup to configuration files, would it be a
>>> nice way to adopt the JINI Discovery model, wherein masters and slaves can
>>> discover each other dynamically through UDP broadcast/heartbeat methods?
>> That assumes that UDP Broadcast is supported through the switches (many
>> turn it off as it creates too much traffic), or UDP multicast is supported
>> (as an example of an infrastructure that does not, play with EC2)
>>> This would mean, any machine can come up and say I am a slave and
>>> automatically discover the master and start supporting the master within
>>> < x seconds.
>> How will your slave determine that the master that it has bonded to is the
>> master that it should bond to and not something malicious within the same
>> multicast range? It's possible, but you generally have to have
>> configuration files in the worker nodes.
>> There's nothing to stop your Configuration Management layer using discovery
>> or central config servers (Zookeeper, Anubis, LDAP, DNS, ...), which then
>> pushes desired state information to the client nodes. These can deal with
>> auth and may support infrastructures that don't support broadcast or
>> multicast. Such tooling also gives you the ability to push out host table
>> config, JVM options, logging parameters, and bounce worker nodes into new
>> states without waiting for them to timeout and try to rediscover new
>> masters.
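
To illustrate the quoted point about a worker trusting whatever master it discovers: one possible approach (an assumption for illustration, not something Hadoop does) is for the master to tag its announcement with an HMAC over a shared secret, which the worker reads from its local configuration. This is why discovery alone does not remove the need for config on the worker nodes. The class name, secret, and announcement format below are all hypothetical, and the sketch does not protect against replayed announcements.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: a worker accepts a master announcement only if its
// HMAC tag matches one computed with the secret from the worker's own config.
public class MasterAnnouncementCheck {

    /** Computes an HMAC-SHA256 tag for the announcement text. */
    public static byte[] sign(byte[] secret, String announcement) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(announcement.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable or key rejected", e);
        }
    }

    /** True if the tag was produced with the same secret; comparison is constant-time. */
    public static boolean isTrusted(byte[] secret, String announcement, byte[] tag) {
        return MessageDigest.isEqual(sign(secret, announcement), tag);
    }

    public static void main(String[] args) {
        byte[] secret = "secret-from-worker-config".getBytes(StandardCharsets.UTF_8);
        String announcement = "master=nn1.example.org:8020"; // made-up format
        byte[] tag = sign(secret, announcement);

        System.out.println(isTrusted(secret, announcement, tag));
        byte[] wrong = "some-other-secret".getBytes(StandardCharsets.UTF_8);
        System.out.println(isTrusted(wrong, announcement, tag));
    }
}
```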
