hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9868) add reading source cluster with HA access mode feature for DistCp
Date Thu, 03 Mar 2016 17:11:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178152#comment-15178152 ]

Wei-Chiu Chuang commented on HDFS-9868:
---------------------------------------

Hi, thanks for the contribution.
Do you have a test case that shuts down the active NameNode to simulate the failover scenario?

> add reading source cluster with HA access mode feature for DistCp
> -----------------------------------------------------------------
>
>                 Key: HDFS-9868
>                 URL: https://issues.apache.org/jira/browse/HDFS-9868
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: distcp
>    Affects Versions: 2.7.1
>            Reporter: NING DING
>            Assignee: NING DING
>         Attachments: HDFS-9868.1.patch, HDFS-9868.2.patch
>
>
> HDFS clusters are normally HA-enabled, and copying a large amount of data with DistCp can
> take a long time. If the source cluster's active NameNode changes during the copy, the DistCp
> job fails. This patch lets DistCp read source cluster files in HA access mode. A source cluster
> configuration file must be specified via the -sourceClusterConf option.
>   The following is an example of the contents of a source cluster configuration
>   file:
> {code:xml}
>     <configuration>
>       <property>
>         <name>fs.defaultFS</name>
>         <value>hdfs://mycluster</value>
>       </property>
>       <property>
>         <name>dfs.nameservices</name>
>         <value>mycluster</value>
>       </property>
>       <property>
>         <name>dfs.ha.namenodes.mycluster</name>
>         <value>nn1,nn2</value>
>       </property>
>       <property>
>         <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>         <value>host1:9000</value>
>       </property>
>       <property>
>         <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>         <value>host2:9000</value>
>       </property>
>       <property>
>         <name>dfs.namenode.http-address.mycluster.nn1</name>
>         <value>host1:50070</value>
>       </property>
>       <property>
>         <name>dfs.namenode.http-address.mycluster.nn2</name>
>         <value>host2:50070</value>
>       </property>
>       <property>
>         <name>dfs.client.failover.proxy.provider.mycluster</name>
>         <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>       </property>
>     </configuration>
> {code}
>   The invocation of DistCp is as below:
> {code}
>     bash$ hadoop distcp -sourceClusterConf sourceCluster.xml /foo/bar hdfs://nn2:8020/bar/foo
> {code}
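
Because a long copy only survives a failover if the client-side HA keys in the source cluster configuration file are complete, it may help to sanity-check that file before launching DistCp. The following is a minimal sketch, not part of the patch or of Hadoop: the function names load_properties and check_ha_client_conf are hypothetical, and the checks cover only the standard HDFS HA client keys that appear in the example above (dfs.nameservices, dfs.ha.namenodes.*, dfs.namenode.rpc-address.*, dfs.client.failover.proxy.provider.*).

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def _split_csv(value):
    """Split a comma-separated config value, dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def check_ha_client_conf(props):
    """Return a list of problems; an empty list means the basic HA keys are present.

    Hypothetical helper: it checks key presence only and does not contact any NameNode.
    """
    problems = []
    nameservices = _split_csv(props.get("dfs.nameservices", ""))
    if not nameservices:
        problems.append("dfs.nameservices is not set")
    for ns in nameservices:
        nn_ids = _split_csv(props.get("dfs.ha.namenodes." + ns, ""))
        if len(nn_ids) < 2:
            problems.append("dfs.ha.namenodes.%s lists fewer than two NameNodes" % ns)
        for nn in nn_ids:
            key = "dfs.namenode.rpc-address.%s.%s" % (ns, nn)
            if not props.get(key):
                problems.append(key + " is not set")
        provider_key = "dfs.client.failover.proxy.provider." + ns
        if not props.get(provider_key):
            problems.append(provider_key + " is not set")
    return problems


# Trimmed copy of the example configuration from the issue description.
EXAMPLE = """
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>host1:9000</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>host2:9000</value></property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    for problem in check_ha_client_conf(load_properties(EXAMPLE)):
        print(problem)
```

With the example configuration the check reports no problems. Removing one of the rpc-address entries would be flagged, which is the kind of omission that would otherwise only surface when the active NameNode changes mid-copy.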



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
