hadoop-common-user mailing list archives

From Ricky Ho <...@adobe.com>
Subject Highly dynamic Hadoop Cluster
Date Wed, 26 Nov 2008 16:10:03 GMT
Does Hadoop support an environment where nodes join and leave without a preconfigured file
like "hadoop-site.xml"?  The key characteristic is that none of the machines' IP addresses
or node names are stable.  They change whenever a machine is rebooted after a crash.

Until now, I have simply configured my hadoop-site.xml and used the startup
scripts, which take care of everything.  But that doesn't seem to work in the dynamic
IP address scenario.  Can someone suggest how to deal with this scenario?
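
For concreteness, a static configuration of the kind described above might look like
the minimal sketch below.  The host names and ports are placeholders I made up for
illustration, not values from a real cluster; fs.default.name and mapred.job.tracker
are the properties that tell the worker daemons where the NameNode and JobTracker are.

```xml
<?xml version="1.0"?>
<!-- hadoop-site.xml (sketch): master addresses are fixed at configuration time. -->
<!-- namenode.example.com and jobtracker.example.com are hypothetical hosts.     -->
<configuration>
  <property>
    <!-- Address DataNodes and clients use to reach the NameNode -->
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
  <property>
    <!-- Address TaskTrackers and clients use to reach the JobTracker -->
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:9001</value>
  </property>
</configuration>
```

Every worker reads these two addresses from its own copy of the file at startup, which
is why the approach seems to break down once the master addresses are no longer stable.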

Here are the considerations ...

Startup Discovery Scenario
How does a NameNode learn about a newly joined DataNode?
How does a new DataNode find the existing NameNode?
How does a JobTracker learn about a newly joined TaskTracker?
How does a new TaskTracker find the existing JobTracker?

Fail Recovery Scenario
Let's say a NameNode crashes, and then another NameNode (at a different address) starts up.
How does the new NameNode learn about the existing DataNodes?
How do the existing DataNodes learn about this new NameNode?

Let's say a JobTracker crashes, and then another JobTracker (at a different address) starts up.
How does the new JobTracker learn about the existing TaskTrackers?
How do the existing TaskTrackers learn about this new JobTracker?

Let's say a DataNode crashes, and then another DataNode (at a different address) starts up.
How does the new DataNode learn about the existing NameNode?
How does the existing NameNode learn about this new DataNode?

Let's say a TaskTracker crashes, and then another TaskTracker (at a different address) starts up.
How does the new TaskTracker learn about the existing JobTracker?
How does the existing JobTracker learn about this new TaskTracker?

