Date: Fri, 4 Apr 2014 21:03:14 +0000 (UTC)
From: "Zhijie Shen (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1903) TestNMClient fails occasionally

    [ https://issues.apache.org/jira/browse/YARN-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960418#comment-13960418 ]

Zhijie Shen commented on YARN-1903:
-----------------------------------

I found the following log:
{code}
2014-04-04 05:08:01,361 INFO  containermanager.ContainerManagerImpl (ContainerManagerImpl.java:getContainerStatusInternal(785)) - Returning ContainerStatus: [ContainerId: container_1396613275302_0001_01_000004, State: RUNNING, Diagnostics: , ExitStatus: -1000, ]
2014-04-04 05:08:01,365 INFO  containermanager.ContainerManagerImpl (ContainerManagerImpl.java:stopContainerInternal(718)) - Stopping container with container Id: container_1396613275302_0001_01_000004
2014-04-04 05:08:01,366 INFO  nodemanager.NMAuditLogger (NMAuditLogger.java:logSuccess(89)) - USER=jenkins IP=10.79.62.28 OPERATION=Stop Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1396613275302_0001 CONTAINERID=container_1396613275302_0001_01_000004
2014-04-04 05:08:01,387 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:isEnabled(169)) - Neither virutal-memory nor physical-memory monitoring is needed. Not running the monitor-thread
2014-04-04 05:08:01,387 INFO  containermanager.AuxServices (AuxServices.java:handle(175)) - Got event CONTAINER_STOP for appId application_1396613275302_0001
2014-04-04 05:08:01,389 INFO  application.Application (ApplicationImpl.java:transition(296)) - Adding container_1396613275302_0001_01_000004 to application application_1396613275302_0001
2014-04-04 05:08:01,389 INFO  nodemanager.NMAuditLogger (NMAuditLogger.java:logSuccess(89)) - USER=jenkins OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1396613275302_0001 CONTAINERID=container_1396613275302_0001_01_000004
2014-04-04 05:08:01,389 INFO  container.Container (ContainerImpl.java:handle(884)) - Container container_1396613275302_0001_01_000004 transitioned from NEW to DONE
2014-04-04 05:08:01,389 INFO  application.Application (ApplicationImpl.java:transition(339)) - Removing container_1396613275302_0001_01_000004 from application application_1396613275302_0001
2014-04-04 05:08:01,390 INFO  util.ProcfsBasedProcessTree (ProcfsBasedProcessTree.java:isAvailable(182)) - ProcfsBasedProcessTree currently is supported only on Linux.
2014-04-04 05:08:01,392 INFO  rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(321)) - container_1396613275302_0001_01_000004 Container Transitioned from ACQUIRED to RUNNING
2014-04-04 05:08:01,393 INFO  containermanager.ContainerManagerImpl (ContainerManagerImpl.java:getContainerStatusInternal(771)) - Getting container-status for container_1396613275302_0001_01_000004
2014-04-04 05:08:01,393 INFO  containermanager.ContainerManagerImpl (ContainerManagerImpl.java:getContainerStatusInternal(785)) - Returning ContainerStatus: [ContainerId: container_1396613275302_0001_01_000004, State: COMPLETE, Diagnostics: , ExitStatus: -1000, ]
{code}

When the kill event is received, the container is still in the NEW state, so it moves to DONE through ContainerDoneTransition, which does not set the kill-related exit code and diagnostics. This matches the last status in the log: State: COMPLETE with empty Diagnostics and ExitStatus: -1000.

> TestNMClient fails occasionally
> -------------------------------
>
>                 Key: YARN-1903
>                 URL: https://issues.apache.org/jira/browse/YARN-1903
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>
> The container status after stopping the container is not as expected.
> {code}
> java.lang.AssertionError: 4: 
> 	at org.junit.Assert.fail(Assert.java:93)
> 	at org.junit.Assert.assertTrue(Assert.java:43)
> 	at org.apache.hadoop.yarn.client.api.impl.TestNMClient.testGetContainerStatus(TestNMClient.java:382)
> 	at org.apache.hadoop.yarn.client.api.impl.TestNMClient.testContainerManagement(TestNMClient.java:346)
> 	at org.apache.hadoop.yarn.client.api.impl.TestNMClient.testNMClient(TestNMClient.java:226)
> {code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
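
The race described in the comment above can be illustrated with a minimal, self-contained sketch. It is not the NodeManager code: the class ContainerKillSketch, its fields, the recordKillWhenNew flag, and the value 137 are all hypothetical names and placeholders chosen for illustration. Only the NEW, RUNNING and DONE states and the ExitStatus of -1000 come from the log. The sketch assumes that a kill arriving at a RUNNING container records kill details, while a kill arriving at NEW goes straight to DONE without them.

```java
public class ContainerKillSketch {

    /** Simplified container states; only these three are needed here. */
    enum State { NEW, RUNNING, DONE }

    // -1000 is the ExitStatus value printed in the log for this container.
    static final int UNSET_EXIT_STATUS = -1000;

    // Hypothetical placeholder meaning "killed"; the real value is not given in the source.
    static final int HYPOTHETICAL_KILLED_EXIT_STATUS = 137;

    /** Hypothetical, heavily simplified stand-in for a NodeManager container. */
    static class Container {
        State state = State.NEW;
        int exitStatus = UNSET_EXIT_STATUS;
        String diagnostics = "";

        /** Moves a NEW container to RUNNING; ignored in any other state. */
        void launch() {
            if (state == State.NEW) {
                state = State.RUNNING;
            }
        }

        /**
         * Handles a kill event.
         * With recordKillWhenNew == false this mimics the behaviour in the log:
         * a kill at NEW reaches DONE without any kill-related details.
         * With recordKillWhenNew == true it shows one possible remedy (an assumption).
         */
        void kill(boolean recordKillWhenNew) {
            if (state == State.DONE) {
                return; // already finished, nothing to record
            }
            if (state == State.RUNNING || recordKillWhenNew) {
                exitStatus = HYPOTHETICAL_KILLED_EXIT_STATUS;
                diagnostics = "Container killed by stop request";
            }
            state = State.DONE;
        }

        @Override
        public String toString() {
            return "[State: " + state + ", Diagnostics: " + diagnostics
                    + ", ExitStatus: " + exitStatus + "]";
        }
    }

    public static void main(String[] args) {
        // Kill arrives before launch: status ends up DONE with no kill details.
        Container early = new Container();
        early.kill(false);
        System.out.println("kill at NEW, as observed: " + early);

        // Kill arrives after launch: kill details are recorded.
        Container late = new Container();
        late.launch();
        late.kill(false);
        System.out.println("kill at RUNNING:          " + late);

        // Possible remedy: record kill details even when the container is still NEW.
        Container remedied = new Container();
        remedied.kill(true);
        System.out.println("kill at NEW, with remedy: " + remedied);
    }
}
```

In this model the first case reproduces the shape of the failing status (finished, empty diagnostics, exit status -1000), which is why a test that expects kill-related details after stopping the container would fail only when the stop request wins the race against the launch.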