Date: Wed, 2 Jul 2014 19:46:26 +0000 (UTC)
From: "Tsz Wo Nicholas Sze (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-6616) bestNode shouldn't always return the first DataNode

    [ https://issues.apache.org/jira/browse/HDFS-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050618#comment-14050618 ]

Tsz Wo Nicholas Sze commented on HDFS-6616:
-------------------------------------------

> Correct me if I'm wrong: when using WebHDFS, I think it will be very rare that both client and the data will be in the same host.

Client and data will be collocated when WebHDFS is used in MapReduce/YARN jobs. When will webhdfs:// be used instead of hdfs://? DistCp for copying data across clusters running different Hadoop versions.

> bestNode shouldn't always return the first DataNode
> ---------------------------------------------------
>
>                 Key: HDFS-6616
>                 URL: https://issues.apache.org/jira/browse/HDFS-6616
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>            Reporter: zhaoyunjiong
>            Assignee: zhaoyunjiong
>            Priority: Minor
>         Attachments: HDFS-6616.patch
>
>
> When running distcp between clusters, the job failed:
> 014-06-30 20:56:28,430 INFO org.apache.hadoop.tools.DistCp: FAIL part-r-00101.avro : java.net.NoRouteToHostException: No route to host
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> 	at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485)
> 	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
> 	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
> 	at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:322)
> 	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
> 	at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:419)
> 	at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
> 	at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> The root cause is that one of the DataNodes cannot be reached from outside the cluster, although inside the cluster it is healthy.
> In NamenodeWebHdfsMethods.java:bestNode, the first DataNode is always returned, so the copy still fails even after distcp retries.
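
The failure mode described in the issue (a deterministic choice that returns the same unreachable DataNode on every retry) can be illustrated with a small sketch. This is not the code in NamenodeWebHdfsMethods.java and not the attached HDFS-6616.patch. The class BestNodeSketch, the method pickNode, the use of plain host-name strings for DataNodes, and the excluded set are all hypothetical, chosen only for illustration. It assumes one plausible remedy: pick randomly among the candidate nodes, optionally skipping nodes the caller has already failed to reach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch only; names and types are assumptions, not Hadoop APIs.
final class BestNodeSketch {
    private BestNodeSketch() {}

    /**
     * Returns one candidate chosen uniformly at random, ignoring any node
     * listed in {@code excluded}. Throws if no usable candidate remains.
     */
    static String pickNode(List<String> candidates, Set<String> excluded, Random rng) {
        List<String> usable = new ArrayList<>();
        for (String node : candidates) {
            if (!excluded.contains(node)) {
                usable.add(node);
            }
        }
        if (usable.isEmpty()) {
            throw new IllegalStateException("No usable DataNode among " + candidates);
        }
        // Random index instead of always taking index 0.
        return usable.get(rng.nextInt(usable.size()));
    }

    public static void main(String[] args) {
        // Made-up host names standing in for the replicas of one block.
        List<String> replicas = Arrays.asList("dn1", "dn2", "dn3");
        Random rng = new Random();

        // First attempt: any replica may be chosen.
        String first = pickNode(replicas, Collections.<String>emptySet(), rng);
        System.out.println("first attempt: " + first);

        // Retry: skip the node that was just tried.
        String retry = pickNode(replicas, Collections.singleton(first), rng);
        System.out.println("retry: " + retry);
    }
}
```

With a random choice, a retry has a chance of landing on a different replica, whereas always taking the first entry repeats the same failure for as long as that node stays unreachable from the client.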