hadoop-common-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Zeroconf for hadoop
Date Mon, 26 Jan 2009 18:35:51 GMT
Owen O'Malley wrote:
> allssh -h node1000-3000 bin/hadoop-daemon.sh start tasktracker
> 
> and it will use ssh in parallel to connect to every node between 
> node1000 and node3000. Ours is a mess, but it would be great if someone 
> contributed a script like that. *smile*

It would be a one-line change to bin/slaves.sh to have it filter hosts 
by a regex.
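
A minimal sketch of such a filter, assuming the host list is a plain file with one host per line and optional # comments. The names filter_hosts and the regex argument are hypothetical, chosen for illustration; they are not part of bin/slaves.sh:

```shell
# Hypothetical helper, not part of bin/slaves.sh:
# print the hosts in a slaves-style file that match an extended regex.
#   $1 = path to the host list (one host per line, '#' starts a comment)
#   $2 = extended regex; defaults to '.', which keeps every non-empty line
filter_hosts() {
  local hostlist="$1" regex="${2:-.}"
  # Strip comments and blank lines first, then keep only matching hosts.
  sed -e 's/#.*$//' -e '/^[[:space:]]*$/d' "$hostlist" | grep -E -- "$regex"
}
```

For example, filter_hosts conf/slaves '^node[12][0-9]{3}$' would keep node1000 through node2999; an exact numeric range like node1000-3000 needs a slightly longer pattern or a numeric comparison.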

Note that bin/slaves.sh can have problems with larger clusters (>~100 
nodes) since a single shell has trouble handling the i/o from 100 
sub-processes, and ssh connections will start timing out.  That's the 
point of the HADOOP_SLAVE_SLEEP parameter, to meter the rate that 
sub-processes are spawned.  A better solution might be to sleep if the 
number of sub-processes exceeds some limit, e.g.:

   while [[ $(jobs | wc -l) -gt 10 ]]; do sleep 1 ; done
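
Wrapped up as a function, the same idea might look like the sketch below. It assumes bash (for [[ ]] and jobs -r); MAX_JOBS and run_throttled are hypothetical names, and the limit of 10 is only the example value from above, not a tuned setting:

```shell
#!/usr/bin/env bash
# Hypothetical throttle, not part of bin/slaves.sh.
# MAX_JOBS is an assumed name; 10 is just the example limit used above.
MAX_JOBS=${MAX_JOBS:-10}

# Run a command in the background, but first wait until fewer than
# MAX_JOBS background jobs of this shell are still running.
run_throttled() {
  # -ge inside [[ ]] compares numerically; 'jobs -r' lists running jobs only.
  while [[ $(jobs -r | wc -l) -ge $MAX_JOBS ]]; do
    sleep 1
  done
  "$@" &
}
```

A loop over the host list would then call something like run_throttled ssh "$host" "$cmd" for each host and finish with a plain wait, so the script does not exit before the last connections complete.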

Doug
