Subject: Re: There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
From: xeon <xeonmailinglist@gmail.com>
Date: Wed, 28 Aug 2013 12:42:28 +0100
To: user@hadoop.apache.org
I am copying from Site1 to Site2 in Amazon EC2. Each site runs one Hadoop cluster with 1 NameNode (NN) and 2 DataNodes (DNs). I think my problem is related to the distcp command being unable to reach one of the HDFS clusters' DNs, as Harsh suggested. Amazon EC2 uses private IPs for communication between hosts in the same site, and public IPs for connections between sites. I am using distcp to copy large data (256 MB per file) between the sites via the public IPs:
hadoop distcp hdfs://publicIP1:9000/wiki hdfs://publicIP2:9000/wiki
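A workaround I am considering (not yet verified here): since the NameNode hands clients the DataNodes' private EC2 addresses, which are unreachable from the other site, every DN ends up excluded. Assuming Hadoop 2.x, where these properties exist, clients can be told to connect to DNs by hostname instead of the advertised IP, e.g. in hdfs-site.xml:

```xml
<!-- Sketch only: assumes Hadoop 2.x, and that the DataNode hostnames
     resolve to reachable (public) addresses on the client side. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```

This only helps if DNS on the client side maps those hostnames to the public IPs.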

Checking the logs (hadoop-ubuntu-namenode-ip-XX-XXX-XXX-94.log), I also see this error, but I don't know if it is related:
2013-08-28 07:40:17,537 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.abandonBlock from XX.XXX.XXX.150:51844: error: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /wiki/.distcp.tmp.attempt_1377674569447_0001_m_000000_1: File does not exist. Holder DFSClient_attempt_1377674569447_0001_m_000000_1_-1864394963_1 does not have any open files.
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /wiki/.distcp.tmp.attempt_1377674569447_0001_m_000000_1: File does not exist. Holder DFSClient_attempt_1377674569447_0001_m_000000_1_-1864394963_1 does not have any open files.
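From what I understand, this LeaseExpiredException is likely a secondary symptom: the attempt id ends in _1, i.e. a retried map attempt racing with the failed original on the same .distcp.tmp file, so it should disappear once the DN reachability problem is fixed. As a sketch, the retry could also be forced to run alone by disabling speculative map execution for the copy (assuming the standard Hadoop 2 property name applies to my version):

```shell
# Sketch only: mapreduce.map.speculative is the Hadoop 2 property name;
# the paths and public IPs are the same placeholders as above.
hadoop distcp -D mapreduce.map.speculative=false \
  hdfs://publicIP1:9000/wiki hdfs://publicIP2:9000/wiki
```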


Here is the report:

Site1
$ hdfs dfsadmin -report
13/08/28 11:17:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 25365413888 (23.62 GB)
Present Capacity: 20488749056 (19.08 GB)
DFS Remaining: 20487512064 (19.08 GB)
DFS Used: 1236992 (1.18 MB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Live datanodes:
Name: XX.XXX.XXX.243:50010 (ip-XX-XXX-XXX-243.eu-west-1.compute.internal)
Hostname: ip-XX-XXX-XXX-243.eu-west-1.compute.internal
Decommission Status : Normal
Configured Capacity: 12682706944 (11.81 GB)
DFS Used: 618496 (604 KB)
Non DFS Used: 2489556992 (2.32 GB)
DFS Remaining: 10192531456 (9.49 GB)
DFS Used%: 0.00%
DFS Remaining%: 80.37%
Last contact: Wed Aug 28 11:17:51 UTC 2013


Name: XX.XXX.XXX.58:50010 (ip-XX-XXX-XXX-58.eu-west-1.compute.internal)
Hostname: ip-XX-XXX-XXX-58.eu-west-1.compute.internal
Decommission Status : Normal
Configured Capacity: 12682706944 (11.81 GB)
DFS Used: 618496 (604 KB)
Non DFS Used: 2387107840 (2.22 GB)
DFS Remaining: 10294980608 (9.59 GB)
DFS Used%: 0.00%
DFS Remaining%: 81.17%
Last contact: Wed Aug 28 11:17:51 UTC 2013


Site2:
$ hdfs dfsadmin -report
13/08/28 11:17:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 25365413888 (23.62 GB)
Present Capacity: 20593844224 (19.18 GB)
DFS Remaining: 20593770496 (19.18 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Live datanodes:
Name: XX.XXX.XXX.95:50010 (ip-XX-XXX-XXX-95.ap-southeast-1.compute.internal)
Hostname: ip-XX-XXX-XXX-95.ap-southeast-1.compute.internal
Decommission Status : Normal
Configured Capacity: 12682706944 (11.81 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 2385768448 (2.22 GB)
DFS Remaining: 10296901632 (9.59 GB)
DFS Used%: 0.00%
DFS Remaining%: 81.19%
Last contact: Wed Aug 28 11:17:56 UTC 2013


Name: XX.XXX.XXX.96:50010 (ip-XX-XXX-XXX-96.ap-southeast-1.compute.internal)
Hostname: ip-XX-XXX-XXX-96.ap-southeast-1.compute.internal
Decommission Status : Normal
Configured Capacity: 12682706944 (11.81 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 2385801216 (2.22 GB)
DFS Remaining: 10296868864 (9.59 GB)
DFS Used%: 0.00%
DFS Remaining%: 81.19%
Last contact: Wed Aug 28 11:17:56 UTC 2013

Any suggestions to fix this problem?






On 08/28/2013 12:09 PM, Jitendra Yadav wrote:
> Hi,
>
> Also, can you please share the dfs health check report of your cluster?
>
> Thanks
>
> On Wed, Aug 28, 2013 at 3:46 PM, xeon <xeonmailinglist@gmail.com> wrote:
>> Hi,
>>
>> I don't have the "dfs.hosts.exclude" property defined, but I still get the error "There are 2 datanode(s) running and 2 node(s) are excluded in this operation." when I run the distcp command. Any help?

