MIME-Version: 1.0
X-Received: by 10.112.155.70 with SMTP id vu6mr199933lbb.41.1382417745861; Mon, 21 Oct 2013 21:55:45 -0700 (PDT)
Received: by 10.112.11.35 with HTTP; Mon, 21 Oct 2013 21:55:45 -0700 (PDT)
Reply-To: rdyer@iastate.edu
Date: Mon, 21 Oct 2013 23:55:45 -0500
Message-ID:
Subject: Hadoop 2.2.0 MR tasks failing
From: Robert Dyer
To: "user@hadoop.apache.org"
Content-Type: multipart/alternative; boundary=089e0115fe1eac602404e94d365c
X-Virus-Checked: Checked by ClamAV on apache.org

--089e0115fe1eac602404e94d365c
Content-Type: text/plain; charset=ISO-8859-1

I recently set up a 2.2.0 test cluster. For some reason, all of my MR jobs are failing. The maps and reduces all run to completion without any errors, yet the app is marked failed and there is no final output. Any ideas?

Application Type: MAPREDUCE
State: FINISHED
FinalStatus: FAILED
Diagnostics: We crashed durring a commit

I noticed this in the logs (but I'm not sure what to make of it):

2013-10-21 23:42:41,379 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 789 for container-id container_1382415258498_0002_01_000001: 250.4 MB of 2 GB physical memory used; 2.0 GB of 6 GB virtual memory used
2013-10-21 23:42:41,743 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1382415258498_0002_01_000001 is : 255
2013-10-21 23:42:41,744 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1382415258498_0002_01_000001 and exit code: 255
org.apache.hadoop.util.Shell$ExitCodeException:
2013-10-21 23:42:41,746 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:
2013-10-21 23:42:41,747 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 255
2013-10-21 23:42:41,747 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1382415258498_0002_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
2013-10-21 23:42:41,747 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1382415258498_0002_01_000001
2013-10-21 23:42:41,764 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /hadoop/hadoop-2.2.0/cluster-data/usercache/hadoop/appcache/application_1382415258498_0002/container_1382415258498_0002_01_000001
2013-10-21 23:42:41,765 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1382415258498_0002 CONTAINERID=container_1382415258498_0002_01_000001

--089e0115fe1eac602404e94d365c
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--089e0115fe1eac602404e94d365c--
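
In case it helps anyone narrowing down a similar failure, here is a minimal sketch of how the non-zero exit codes could be pulled out of a NodeManager log like the one quoted in this message. It only assumes the "Exit code from container ... is : N" line format shown in the excerpt; the function name `failed_containers` and the `sample` string are hypothetical and are not part of Hadoop.

```python
import re

# Matches the DefaultContainerExecutor line format seen in the log excerpt.
# The pattern is an assumption based on that one line, not an official format.
EXIT_RE = re.compile(r"Exit code from container (container_\S+) is : (\d+)")


def failed_containers(log_text):
    """Return {container_id: exit_code} for every non-zero exit code found."""
    failures = {}
    for match in EXIT_RE.finditer(log_text):
        container_id = match.group(1)
        code = int(match.group(2))
        if code != 0:
            failures[container_id] = code
    return failures


# One line copied from the log in this message, used as sample input.
sample = (
    "2013-10-21 23:42:41,743 WARN org.apache.hadoop.yarn.server.nodemanager."
    "DefaultContainerExecutor: Exit code from container "
    "container_1382415258498_0002_01_000001 is : 255\n"
)

print(failed_containers(sample))
```

This only lists which containers exited abnormally; it does not explain why. The container's own stdout, stderr, and syslog would still be the place to look for the actual cause.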