hadoop-user mailing list archives

From Suresh Srinivas <sur...@hortonworks.com>
Subject Re: Using distcp with Hadoop HA
Date Tue, 29 Jan 2013 23:03:54 GMT
Currently, as you have pointed out, client-side configuration-based
failover is used in an HA setup. The configuration must define the namenode
addresses for the nameservices of both clusters. Are the datanodes
belonging to the two clusters running on the same set of nodes? Can you
share the configuration you are using, so we can diagnose the problem?
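
To make "namenode addresses for the nameservices of both clusters" concrete,
here is a minimal sketch that builds the client-side HA properties for two
nameservices and renders them as an hdfs-site.xml fragment. The property names
are the standard HDFS HA client keys; the nameservice IDs (clusterA, clusterB),
the namenode IDs (nn1, nn2), the host names, and the port are assumptions for
illustration only and are not taken from the poster's setup.

```python
import xml.etree.ElementTree as ET

# Hypothetical nameservice IDs, namenode IDs and RPC addresses.
# Replace them with the values from your own clusters.
CLUSTERS = {
    "clusterA": {"nn1": "nn1.a.example.com:8020", "nn2": "nn2.a.example.com:8020"},
    "clusterB": {"nn1": "nn1.b.example.com:8020", "nn2": "nn2.b.example.com:8020"},
}

# Failover proxy provider class shipped with HDFS for client-side failover.
PROXY_PROVIDER = (
    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
)


def ha_client_properties(clusters):
    """Return the client-side HA properties for every nameservice in `clusters`."""
    props = {"dfs.nameservices": ",".join(clusters)}
    for ns, namenodes in clusters.items():
        # Logical namenode IDs that belong to this nameservice.
        props[f"dfs.ha.namenodes.{ns}"] = ",".join(namenodes)
        # RPC address of each namenode in the nameservice.
        for nn_id, address in namenodes.items():
            props[f"dfs.namenode.rpc-address.{ns}.{nn_id}"] = address
        # Class the client uses to find the active namenode.
        props[f"dfs.client.failover.proxy.provider.{ns}"] = PROXY_PROVIDER
    return props


def to_hdfs_site_xml(props):
    """Render a property dict as a <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_hdfs_site_xml(ha_client_properties(CLUSTERS)))
```

With both nameservices defined on the client, distcp can address each cluster
by its logical name, for example `hadoop distcp hdfs://clusterA/src
hdfs://clusterB/dst` (the paths here are placeholders).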

> - I am trying to do a distcp from cluster A to cluster B. Since no
> operations are supported on the standby namenode, I need to specify either
> the active namenode while using distcp or use the failover proxy provider
> (dfs.client.failover.proxy.provider.clusterA) where I can specify the two
> namenodes for cluster B and the failover code inside HDFS will figure it
> out.

> - If I use the failover proxy provider, some of my datanodes on cluster A
> would connect to the namenode on cluster B and vice versa. I am assuming
> that is because I have configured both nameservices in my hdfs-site.xml for
> distcp to work. I have configured dfs.nameservice.id to be the right one
> but the datanodes do not seem to respect that.
> What is the best way to use distcp with a Hadoop HA configuration without
> having the datanodes connect to the remote namenode? Thanks
> Regards,
> Dhaval

