hadoop-mapreduce-user mailing list archives

From Matei Zaharia <ma...@eecs.berkeley.edu>
Subject Re: Dynamic Cluster Node addition
Date Fri, 01 Jul 2011 04:05:34 GMT
You can have a new TaskTracker or DataNode join the cluster by just starting that daemon on
the slave (e.g. bin/hadoop-daemon.sh start tasktracker) and making sure it is configured to
connect to the right JobTracker or NameNode (through the mapred.job.tracker and fs.default.name
properties in the config files). The slaves file is only used for the bin/start-* and bin/stop-*
scripts; Hadoop doesn't look at it at runtime. There may be other similar files it does
consult, such as a blacklist (hosts exclude file), but I think that in the default
configuration you can just launch the daemon and it will work.
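
As a sketch of what that looks like on a Hadoop 0.20/1.x-style install (hostnames and ports below are placeholders, not values from this thread), the new slave's config needs these two properties:

```
<!-- core-site.xml on the new slave -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode-host:9000</value>
</property>

<!-- mapred-site.xml on the new slave -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker-host:9001</value>
</property>
```

and then the daemons are started on that slave directly:

```
# run on the new slave; each daemon registers itself with the
# NameNode / JobTracker named in the config above
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker
```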

Note that if you add a new DataNode, Hadoop won't automatically move old data to it (to spread
the data out across the cluster) unless you run the HDFS rebalancer, at least as far as I know.
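
For reference, the rebalancer is invoked roughly like this (the threshold value is an illustrative choice, not a recommendation from this thread):

```
# run from any node with Hadoop installed; moves blocks until each
# DataNode's disk utilization is within 10% of the cluster average
bin/start-balancer.sh -threshold 10

# the balancer can be stopped early if it is consuming bandwidth
bin/stop-balancer.sh
```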


On Jun 30, 2011, at 8:56 PM, Paul Rimba wrote:

> Hey there,
> i am trying to add a new datanode/tasktracker to a currently running cluster.
> Is this feasible? And if yes, how do i change the masters, slaves and dfs.replication (in
hdfs-site.xml) configuration?
> can i add the new slave to the slaves configuration file while the cluster is running?
> i found this ./bin/hadoop dfs -setrep -w 4 /path/to/file command to change the dfs.replication
on the fly.
> Is there a better way to do it?
> Thank you for your kind attention.
> Kind Regards,
> Paul
