hadoop-hdfs-user mailing list archives

From Adam Faris <afa...@linkedin.com>
Subject Re: services requiring topology conf
Date Fri, 11 Jan 2013 17:15:47 GMT
A patch was submitted for the topology documentation, but it doesn't appear to have made it
into any release.  The svn revision below may help; start reading at line 1294.

http://svn.apache.org/viewvc?view=revision&revision=1411359

Assuming you are using Hadoop 1.x and not YARN, the topology script only needs to be on the
NameNode and JobTracker.  As you have noticed, it doesn't hurt anything to copy the script
everywhere, since the TaskTracker and DataNode processes will ignore it.  Try looking at pdsh
for controlling compute nodes and pushing files, but be careful: if you type a bad command,
it will be run everywhere. http://code.google.com/p/pdsh/
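
For reference, here is a rough sketch of what a topology script can look like. This is not
your script, just an illustration under some assumptions: the file name topology.conf, its
"host rack" line format, and the helper names load_rack_map and resolve are all made up for
the example. The parts that come from Hadoop are the contract (hosts or IPs arrive as
arguments, and one rack path per argument is printed to stdout) and the /default-rack
fallback.

```python
#!/usr/bin/env python
"""Minimal topology script sketch (illustrative; names and file format are assumptions)."""
import sys

# Rack used by Hadoop when a host cannot be resolved.
DEFAULT_RACK = "/default-rack"


def load_rack_map(lines):
    """Parse 'host-or-ip rack' pairs, one per line; '#' starts a comment.

    The file format is hypothetical, chosen only for this example.
    """
    rack_map = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) >= 2:
            rack_map[parts[0]] = parts[1]
    return rack_map


def resolve(hosts, rack_map):
    """Return one rack path per host, in the same order as the input."""
    return [rack_map.get(host, DEFAULT_RACK) for host in hosts]


def main(argv, conf_path="topology.conf"):
    # conf_path is an assumed location; an absolute path is safer in practice,
    # since the script runs with the daemon's working directory.
    try:
        with open(conf_path) as conf:
            rack_map = load_rack_map(conf)
    except IOError:
        rack_map = {}  # no mapping file: every host falls back to the default rack
    # Hadoop may pass several hosts in one call and expects one rack per argument.
    print(" ".join(resolve(argv[1:], rack_map)))


if __name__ == "__main__":
    main(sys.argv)
```

You would point topology.script.file.name at the script on the NameNode and JobTracker and
make sure it is executable by the user running those daemons.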

-- Adam


On Jan 11, 2013, at 9:01 AM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:

> The documentation on topology conf (topology.script.file.name) is a little sparse, and
> while we have it working in our cluster I am trying to make it a little easier to configure.
> 
> Currently we upload a python file and conf file to every node in our cluster.  However
> I have a feeling that it is only needed on the NameNode(s) and perhaps JobTracker.  I checked
> the code for DataNode and see no reference to this configuration parameter, but I wanted to
> check with you all before I stop updating the conf on every one of my nodes.
> 
> Can anyone confirm whether these configuration files only need to be present on the NameNode/JobTracker,
> or do they need to be on every node in a cluster?
> 
> Thanks

