hadoop-common-user mailing list archives

From Bill Graham <billgra...@gmail.com>
Subject Re: Changing hostnames of tasktracker/datanode nodes - any problems?
Date Tue, 10 Aug 2010 17:54:24 GMT
Sorry to hijack the thread but I have a similar use case.

In a few months we're going to be moving colos. The new cluster will be the
same size as the current cluster and some downtime is acceptable. The
hostnames will be different. From what I've read in this thread it seems
like it would be safe to do the following:

1. Build out the new cluster without starting it.
2. Shut down the entire old cluster (NN, SNN, DNs).
3. scp the relevant data and name dirs for each host to the new hardware.
4. Start the new cluster.

Is it correct to say that this would work fine? We have a replication factor
of 2, so we'd be copying twice as much data as we need to, and I'm sure
there's a more efficient approach.
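
As a rough illustration of step 3, here is a minimal Python sketch that only
builds the scp commands for an old-host to new-host mapping; it does not run
them. The hostnames and the /data/dfs/data path are hypothetical, and the
sketch assumes each directory should land at the same path on the new host
and that its parent directory already exists there.

```python
import posixpath
import shlex


def build_copy_commands(host_map, dirs):
    """Return (old_host, command) pairs, one per host and directory.

    Each command is meant to be run on old_host while both clusters are down.
    """
    plan = []
    for old_host, new_host in sorted(host_map.items()):
        for d in dirs:
            src = d.rstrip("/")
            # Copy into the parent directory on the new host so the
            # directory keeps the same name and path there.
            parent = posixpath.dirname(src).rstrip("/")
            dest = f"{new_host}:{parent}/"
            cmd = f"scp -rp {shlex.quote(src)} {shlex.quote(dest)}"
            plan.append((old_host, cmd))
    return plan


if __name__ == "__main__":
    # Hypothetical hostnames and paths, for illustration only.
    host_map = {"old-dn01": "new-dn01", "old-dn02": "new-dn02"}
    for host, cmd in build_copy_commands(host_map, ["/data/dfs/data"]):
        print(f"[{host}] {cmd}")
```

Printing the plan first makes it easy to review the mapping before anything
is copied.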

What about adding the new nodes in the new colo to the existing cluster,
rebalancing and then decommissioning the old cluster nodes before finally
migrating the NN/SNN? I know Hadoop isn't intended to run cross-colo, but
would this be a more efficient approach than the one above?
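
For the decommissioning variant, a minimal Python sketch of how the old
nodes might be split into waves and rendered as an exclude list. It assumes
the usual mechanism of listing hosts, one per line, in the file named by
dfs.hosts.exclude and then running "hadoop dfsadmin -refreshNodes" on the
NN. The hostnames and the batch size are hypothetical, and whether to
decommission in waves at all is an assumption, not something established in
this thread.

```python
def decommission_batches(old_hosts, batch_size):
    """Split the old hosts into sorted, de-duplicated waves of batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    hosts = sorted(set(old_hosts))
    return [hosts[i:i + batch_size] for i in range(0, len(hosts), batch_size)]


def exclude_file_text(hosts):
    """Render hostnames one per line, as expected in the exclude file."""
    return "".join(f"{h}\n" for h in hosts)


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    old_hosts = ["old-dn03", "old-dn01", "old-dn02"]
    excluded = []
    for wave in decommission_batches(old_hosts, batch_size=2):
        # The exclude list is cumulative: earlier waves stay listed.
        excluded.extend(wave)
        print("--- write to the dfs.hosts.exclude file, then refreshNodes ---")
        print(exclude_file_text(excluded), end="")
```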

On Tue, Aug 10, 2010 at 8:59 AM, Allen Wittenauer wrote:

> On Aug 10, 2010, at 7:07 AM, Brian Bockelman wrote:
> > Hi Erik,
> >
> > You can also do this one-by-one (aka, a rolling reboot).  Shut it down,
> > wait for it to be recognized as dead, then bring it back up with a new
> > hostname.  It will take a much longer time, but you won't have any decrease
> > in availability, just some minor decrease in capacity.
>
> ... and potentially problems with dfs.hosts.
