hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: Changing hostnames of tasktracker/datanode nodes - any problems?
Date Tue, 10 Aug 2010 14:07:25 GMT
Hi Erik,

You can also do this one-by-one (aka, a rolling reboot).  Shut it down, wait for it to be
recognized as dead, then bring it back up with a new hostname.  It will take a much longer
time, but you won't have any decrease in availability, just some minor decrease in capacity.

This is useful in sites like ours where we have 24/7 usage and try to avoid any unnecessary
downtime.
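
For a rough idea of how long "wait for it to be recognized as dead" takes for a datanode, here is a small sketch of the expiry calculation. It assumes the 0.20-era property names heartbeat.recheck.interval and dfs.heartbeat.interval and their stock defaults (5 minutes and 3 seconds); the function name is made up for illustration, so check your own config before relying on the number.

```python
def datanode_dead_timeout_seconds(recheck_interval_ms=300000,
                                  heartbeat_interval_s=3):
    """Approximate time before the namenode marks a silent datanode dead.

    Assumed formula (0.20-era behaviour):
        2 * heartbeat.recheck.interval + 10 * dfs.heartbeat.interval
    recheck_interval_ms  -- heartbeat.recheck.interval, in milliseconds
    heartbeat_interval_s -- dfs.heartbeat.interval, in seconds
    """
    total_ms = 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s
    return total_ms / 1000.0


# With the assumed defaults this comes out to 630 seconds (10.5 minutes)
# of waiting per node before it is safe to bring it back under a new name.
timeout = datanode_dead_timeout_seconds()
```

Multiply that by the number of nodes to see why the one-by-one approach takes so much longer than a full restart.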

Brian

On Aug 10, 2010, at 8:42 AM, Allen Wittenauer wrote:

> 
> On Aug 10, 2010, at 3:51 AM, Erik Forsberg wrote:
> 
>> Hi!
>> 
>> Due to network reconfigurations, I need to change the hostnames of some
>> of my worker nodes, i.e. the nodes running tasktracker and datanode. I
>> need to do this to make my hostname naming scheme actually match
>> the network setup.
>> 
>> Are there any problems doing this? It seems HDFS identifies things
>> using some kind of hostname-independent UID, so I guess it should
>> recover from one machine going down and then appearing again under another
>> hostname?
> 
> 
> The name node metadata only maps block numbers to files.  The mapping from data nodes
> to blocks is determined at run time.  So it is perfectly safe and legal to:
> 
> a) bring grid down
> b) rename everything
> c) fix config files
> d) bring grid up
> 
> Don't forget to change your network topology script, slaves, and dfs.hosts as necessary.
> If fsck complains about topology violations, use setrep to increase and then decrease the
> replication factor of the affected files.
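
The setrep trick above could be scripted roughly as follows. This is only a sketch: the helper names, the default replication factor of 3, and the example path are assumptions, and it only relies on the `hadoop fs -setrep [-R] [-w] <rep> <path>` shell command (`-w` waits for the replication to complete). By default it just prints the commands instead of running them.

```python
import subprocess


def setrep_bounce_commands(path, normal_rep=3):
    """Build the two setrep invocations: raise replication, then restore it.

    Raising the factor by one makes the namenode place an extra replica,
    and lowering it again lets it drop one, which gives it a chance to
    fix the rack placement of the file's blocks.
    """
    raised = normal_rep + 1
    return [
        ["hadoop", "fs", "-setrep", "-w", str(raised), path],
        ["hadoop", "fs", "-setrep", "-w", str(normal_rep), path],
    ]


def run_bounce(path, normal_rep=3, dry_run=True):
    """Print (dry_run=True) or execute the two commands in order."""
    for cmd in setrep_bounce_commands(path, normal_rep):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)
```

For example, `run_bounce("/user/erik/data/part-00000")` (a hypothetical path) prints the two commands to run for a file that fsck flagged.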

