Date: Thu, 9 Mar 2017 16:16:38 +0000 (UTC)
From: "larsonreever (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-9868) Add ability for DistCp to run between 2 clusters

    [ https://issues.apache.org/jira/browse/HDFS-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903301#comment-15903301 ]

larsonreever commented on HDFS-9868:
------------------------------------

I cannot find a description of the dfs.internal.nameservices configuration property in any Hadoop documentation; the only place I can find it defined is hdfs-default.xml. From its description there, it is hard to connect the property with the problem of DistCp reading files from an HA cluster. I suggest mentioning it in the DistCp guide, with an example. If I have missed a document that already covers it, please tell me. Thanks.
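A minimal sketch of the kind of example requested above, written as a client-side hdfs-site.xml on the local cluster. The nameservice names (localns, remotens) and the host names are hypothetical and do not come from this issue. The sketch assumes, based on the hdfs-default.xml description, that dfs.nameservices lists every nameservice a client should be able to resolve, while dfs.internal.nameservices restricts which of them belong to the local cluster, so that local DataNodes do not try to report to the remote NameNodes:

{code:xml}
<configuration>
  <!-- Every nameservice a client on this cluster should resolve (names are hypothetical) -->
  <property>
    <name>dfs.nameservices</name>
    <value>localns,remotens</value>
  </property>
  <!-- Only the nameservice that belongs to this cluster -->
  <property>
    <name>dfs.internal.nameservices</name>
    <value>localns</value>
  </property>
  <!-- HA details of the remote nameservice; the matching entries for localns are assumed to exist already -->
  <property>
    <name>dfs.ha.namenodes.remotens</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.remotens.nn1</name>
    <value>remotehost1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.remotens.nn2</name>
    <value>remotehost2:9000</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.remotens</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
{code}

With settings along these lines, DistCp could presumably address both clusters by logical name, for example hadoop distcp hdfs://localns/foo/bar hdfs://remotens/bar/foo, so a NameNode failover on either side would be handled by the failover proxy provider rather than by a fixed host name.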
http://www.alwaysreview.net/reviews/sky-customer-services/

> Add ability for DistCp to run between 2 clusters
> ------------------------------------------------
>
>                 Key: HDFS-9868
>                 URL: https://issues.apache.org/jira/browse/HDFS-9868
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: distcp
>    Affects Versions: 2.7.1
>            Reporter: NING DING
>            Assignee: NING DING
>         Attachments: HDFS-9868.05.patch, HDFS-9868.06.patch, HDFS-9868.07.patch, HDFS-9868.08.patch, HDFS-9868.09.patch, HDFS-9868.10.patch, HDFS-9868.1.patch, HDFS-9868.2.patch, HDFS-9868.3.patch, HDFS-9868.4.patch
>
>
> Normally the HDFS cluster is HA enabled. Copying a large amount of data with DistCp can take a long time. If the active NameNode of the source cluster changes during the copy, the DistCp job fails. This patch lets DistCp read source cluster files in HA access mode. A source cluster configuration file needs to be specified (via the -sourceClusterConf option).
> The following is an example of the contents of a source cluster configuration file:
> {code:xml}
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://mycluster</value>
>   </property>
>   <property>
>     <name>dfs.nameservices</name>
>     <value>mycluster</value>
>   </property>
>   <property>
>     <name>dfs.ha.namenodes.mycluster</name>
>     <value>nn1,nn2</value>
>   </property>
>   <property>
>     <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>     <value>host1:9000</value>
>   </property>
>   <property>
>     <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>     <value>host2:9000</value>
>   </property>
>   <property>
>     <name>dfs.namenode.http-address.mycluster.nn1</name>
>     <value>host1:50070</value>
>   </property>
>   <property>
>     <name>dfs.namenode.http-address.mycluster.nn2</name>
>     <value>host2:50070</value>
>   </property>
>   <property>
>     <name>dfs.client.failover.proxy.provider.mycluster</name>
>     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>   </property>
> </configuration>
> {code}
> The invocation of DistCp is as below:
> {code}
>     bash$ hadoop distcp -sourceClusterConf sourceCluster.xml /foo/bar hdfs://nn2:8020/bar/foo
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org