hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: adding or restarting a data node in a hadoop cluster
Date Tue, 01 May 2012 03:19:25 GMT


On Tue, May 1, 2012 at 8:28 AM, sumadhur <sumadhur_iitr@yahoo.com> wrote:
> I am on hadoop 0.20.
> To add a data node to a cluster, if we do not use the include/exclude/slaves files, do
> we need to do anything other than configuring the hdfs-site.xml to point to name node and
> the mapred-site.xml to point to job tracker?
> For example, should the job tracker and name node be restarted always?

Just booting up the DN service with the right config and a configured
network for proper communication should suffice.
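
As a rough illustration of the "configured network" part, here is a minimal Python sketch that checks from the new node whether the NameNode and JobTracker ports accept TCP connections before you start the DN. The helper names (can_reach, report) are made up for this example, and the hosts and ports in the usage note are hypothetical; substitute the addresses from fs.default.name and mapred.job.tracker in your own configuration. It only tests TCP reachability, not that the services are healthy.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def report(endpoints, timeout=3.0):
    """Build one status line per endpoint.

    endpoints maps a label (e.g. "NameNode") to a (host, port) tuple.
    The labels and addresses are supplied by the caller; none are built in.
    """
    lines = []
    for name, (host, port) in endpoints.items():
        status = "reachable" if can_reach(host, port, timeout) else "NOT reachable"
        lines.append(f"{name} {host}:{port} is {status}")
    return lines
```

For example, report({"NameNode": ("namenode.example.com", 8020), "JobTracker": ("jobtracker.example.com", 8021)}) would return two status lines; those hostnames and ports are placeholders, not values from this thread.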

In case you're using rack-awareness, ensure you update the
rack-awareness script for your new node and refresh the NN before you
start your DN.
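
For reference, a rack-awareness (topology) script receives one or more node addresses as command-line arguments and prints one rack path per address. A minimal Python sketch follows, assuming a hard-coded table; the IPs and rack names are hypothetical, and your own script may look the mapping up from a file or derive it from the subnet instead.

```python
import sys

# Hypothetical mapping from DataNode address to rack path.
# Add an entry for the new node here before starting its DN.
RACKS = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",  # the newly added node in this example
}

# Rack reported for any address that is not in the table.
DEFAULT_RACK = "/default-rack"


def resolve_racks(addresses):
    """Return the rack path for each address, in the same order."""
    return [RACKS.get(addr, DEFAULT_RACK) for addr in addresses]


if __name__ == "__main__":
    # Print one rack per line, in the order the addresses were passed in.
    for rack in resolve_racks(sys.argv[1:]):
        print(rack)
```

Calling resolve_racks(["10.0.2.21", "10.9.9.9"]) with the table above gives ["/dc1/rack2", "/default-rack"], so an unlisted node falls back to the default rack rather than failing.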

A restart isn't required for adding new nodes to the cluster.

> On a related note, if we restart a data node (that has some blocks on it) and the data
> node now has a new IP address, should we restart the namenode/job tracker for hdfs and map-reduce
> to function correctly?
> Would the blocks on the restarted data node be detected or would hdfs think that these
> blocks were lost and start replicating them?

Cleanly stopping the DN, changing its IP/hostname, and starting it back
up should not cause any block movement.

Harsh J
