Return-Path: X-Original-To: apmail-hadoop-hdfs-user-archive@minotaur.apache.org Delivered-To: apmail-hadoop-hdfs-user-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id AE836E1CB for ; Sat, 2 Feb 2013 14:44:18 +0000 (UTC) Received: (qmail 42262 invoked by uid 500); 2 Feb 2013 14:44:12 -0000 Delivered-To: apmail-hadoop-hdfs-user-archive@hadoop.apache.org Received: (qmail 41871 invoked by uid 500); 2 Feb 2013 14:44:12 -0000 Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hadoop.apache.org Delivered-To: mailing list user@hadoop.apache.org Received: (qmail 41849 invoked by uid 99); 2 Feb 2013 14:44:11 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 02 Feb 2013 14:44:11 +0000 X-ASF-Spam-Status: No, hits=1.7 required=5.0 tests=FREEMAIL_ENVFROM_END_DIGIT,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of yypvsxf19870706@gmail.com designates 209.85.128.48 as permitted sender) Received: from [209.85.128.48] (HELO mail-qe0-f48.google.com) (209.85.128.48) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 02 Feb 2013 14:44:06 +0000 Received: by mail-qe0-f48.google.com with SMTP id 3so2280244qea.35 for ; Sat, 02 Feb 2013 06:43:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:content-type; bh=3IIlV8ycfk0BVBJP90ZywsMauDevbLXPWEJr0hEsfzA=; b=qDMxXXOvOfbbTFw237HhDhgybuCKCVg74qVN+xfC9QMajvtWQAwj3zedk8LvYEOvWD wnXTrxSwaPgKDFzdNoyGRSsEpPDkXH57AhVuSJeBAzN6vu9qw2JbUjcfOyxKxGbv5o66 +MMADjIf+72CKFK1n68OOgnxLeW6CIuGeKa6vWNHTP+7G12tJjXiFLpkNCcCoykEQYz/ zDLY8MS8THejUsl0j0qYn4Y0WDq2uEsk9MIpm7W0dYpoDs7RP1hRovAZ2NsXH4V8pWHy 
M8XQcGhsD2wu9h1FN66dipXENhIs61IhVHBu2DhlK+Q82SzizRg72hkiywr10/hzOqaA kHFA==
MIME-Version: 1.0
X-Received: by 10.49.34.146 with SMTP id z18mr19184455qei.29.1359816225220; Sat, 02 Feb 2013 06:43:45 -0800 (PST)
Received: by 10.49.82.202 with HTTP; Sat, 2 Feb 2013 06:43:45 -0800 (PST)
In-Reply-To:
References:
Date: Sat, 2 Feb 2013 22:43:45 +0800
Message-ID:
Subject: Re: YARN NM containers were killed
From: YouPeng Yang
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=047d7b33cfb210671004d4bee394
X-Virus-Checked: Checked by ClamAV on apache.org

--047d7b33cfb210671004d4bee394
Content-Type: text/plain; charset=ISO-8859-1

Hi All,

I am sorry to bother you, but I have to raise this problem again.
I would like to understand why some of the containers were killed.
The details of the situation are described in the mail I posted a few days ago (quoted below).

My questions:
1. Why were 2 containers created on Hadoop02 while Hadoop04 got none? Is that normal?
2. What principle determines where containers are created?
3. Why were two containers (container_*_000003 and container_*_000002) killed, while container_*_000001 succeeded? Is that normal?

Any suggestions will be appreciated.

Regards,
YouPeng Yang

2013/1/31 YouPeng Yang
> Hi
>
> I posted my question a day ago. Could somebody please help me figure out
> what the problem is?
> Thank you.
> Regards,
> YouPeng Yang
>
>
> ---------- Forwarded message ----------
> From: YouPeng Yang
> Date: 2013/1/30
> Subject: YARN NM containers were killed
> To: user@hadoop.apache.org
>
>
> I tested hadoop-mapreduce-examples-2.0.0-cdh4.1.2.jar on my Hadoop
> environment (1 RM: Hadoop01; 3 NMs: Hadoop02, Hadoop03, Hadoop04;
> OS: CDH4.1.2 on RHEL 5.5):
>
> ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0-cdh4.1.2.jar wordcount 1/input output
>
> When I checked the logs, I was confused by what I found:
> Hadoop created 2 containers on Hadoop02 and 1 container on Hadoop03,
> but 0 containers on Hadoop04.
>
> The results of the container processing:
>
> Hadoop02:
> * container_1359422495723_0001_01_000001
> (its state changes as follows: NEW --> LOCALIZING --> LOCALIZED --> RUNNING --> KILLING --> EXITED_WITH_SUCCESS)
>
> The log indicates that:
> NodeStatusUpdaterImpl: Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics: "", exit_status: -1000,
> ContainerLaunch: Container container_1359422495723_0001_01_000001 succeeded
> Container: Container container_1359422495723_0001_01_000001 transitioned from RUNNING to EXITED_WITH_SUCCESS
> ContainerLaunch: Cleaning up container container_1359422495723_0001_01_000001
> NMAuditLogger: USER=hadoop OPERATION=Container Finished - Succeeded TARGET=ContainerImpl RESULT=SUCCESSAPPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000001
>
> * container_1359422495723_0001_01_000003
> (its state changes as follows: NEW --> LOCALIZING --> LOCALIZED --> RUNNING --> KILLING --> CONTAINER_CLEANEDUP_AFTER_KILL --> DONE)
>
> The log indicates that:
> NodeStatusUpdaterImpl: Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 3, }, state: C_RUNNING, diagnostics: "Container killed by the ApplicationMaster.\n", exit_status: -1000,
> DefaultContainerExecutor: Exit code from task is : 137
> NMAuditLogger: USER=hadoop OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000003
>
> Hadoop03:
> * container_1359422495723_0001_01_000002
> (its state changes as follows: NEW --> LOCALIZING --> LOCALIZED --> RUNNING --> KILLING --> CONTAINER_CLEANEDUP_AFTER_KILL --> DONE)
>
> The log indicates that:
> NodeStatusUpdaterImpl: Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 2, }, state: C_RUNNING, diagnostics: "Container killed by the ApplicationMaster.\n", exit_status: -1000,
> DefaultContainerExecutor: Exit code from task is : 143
> NMAuditLogger: USER=hadoop OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000002
>
> My questions:
> 1. Why were 2 containers created on Hadoop02 while Hadoop04 got none? Is that normal?
> 2. What principle determines where containers are created?
> 3. Why were two containers (container_*_000003 and container_*_000002) killed, while container_*_000001 succeeded? Is that normal?
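
A side note on the two exit codes reported above (137 and 143). On Linux, shells conventionally report a process that was terminated by a signal as 128 plus the signal number, so these values most likely mean the task processes were stopped by signals rather than failing on their own. The following is only a minimal Python sketch of that arithmetic. The function name describe_exit_code and the small signal table are illustrative assumptions, not part of Hadoop, and the sketch says nothing about why the ApplicationMaster requested the kill.

```python
# Sketch only: decode exit codes using the common "128 + signal number" convention.
# SIGNAL_NAMES and describe_exit_code are hypothetical helpers, not Hadoop APIs.

# Standard POSIX numbers for the two signals relevant to the codes in the log.
SIGNAL_NAMES = {9: "SIGKILL", 15: "SIGTERM"}


def describe_exit_code(code: int) -> str:
    """Return a short description of an exit code, assuming 128 + N means signal N."""
    if code > 128:
        signum = code - 128
        name = SIGNAL_NAMES.get(signum, "unknown")
        return f"{code} = 128 + {signum} ({name})"
    return f"{code} (normal exit status, no signal)"


if __name__ == "__main__":
    # The two values reported by DefaultContainerExecutor in the NM logs.
    for code in (137, 143):
        print(describe_exit_code(code))
    # prints:
    # 137 = 128 + 9 (SIGKILL)
    # 143 = 128 + 15 (SIGTERM)
```

Read this way, container_*_000003 appears to have been stopped with SIGKILL and container_*_000002 with SIGTERM, which is consistent with the "Container killed by the ApplicationMaster" diagnostics in both cases.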
>
>
> Logs of Hadoop01 as follows:
>
> 2013-01-29 09:23:48,904 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 1
> 2013-01-29 09:23:50,201 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 1 submitted by user hadoop
> 2013-01-29 09:23:50,204 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop IP=10.167.14.221 OPERATION=Submit Application Request TARGET=ClientRMServiceRESULT=SUCCESS APPID=application_1359422495723_0001
> 2013-01-29 09:23:50,221 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from NEW to SUBMITTED
> 2013-01-29 09:23:50,221 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering appattempt_1359422495723_0001_000001
> 2013-01-29 09:23:50,222 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from NEW to SUBMITTED
> 2013-01-29 09:23:50,242 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application Submission: application_1359422495723_0001 from hadoop, currently active: 1
> 2013-01-29 09:23:50,250 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from SUBMITTED to SCHEDULED
> 2013-01-29 09:23:50,250 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from SUBMITTED to ACCEPTED
> 2013-01-29 09:23:50,581 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Container Transitioned from NEW to ALLOCATED
> 2013-01-29 09:23:50,581 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000001
> 2013-01-29 09:23:50,581 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000001 of capacity memory: 1536 on host Hadoop02:39876, which currently has 1 containers, memory: 1536 used and memory: 6656 available
> 2013-01-29 09:23:50,582 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Container Transitioned from ALLOCATED to ACQUIRED
> 2013-01-29 09:23:50,583 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from SCHEDULED to ALLOCATED
> 2013-01-29 09:23:50,587 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1359422495723_0001_000001
> 2013-01-29 09:23:50,606 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1359422495723_0001_01_000001, NodeId: Hadoop02:39876, NodeHttpAddress: Hadoop02:8042, Resource: memory: 1536, Priority: org.apache.hadoop.yarn.api.records.impl.pb.PriorityPBImpl@1f, State: NEW, Token: null, Status: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 1, }, state: C_NEW, ] for AM appattempt_1359422495723_0001_000001
> 2013-01-29 09:23:50,606 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1359422495723_0001_01_000001 : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
> 2013-01-29 09:23:51,030 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1359422495723_0001_01_000001, NodeId: Hadoop02:39876, NodeHttpAddress: Hadoop02:8042, Resource: memory: 1536, Priority: org.apache.hadoop.yarn.api.records.impl.pb.PriorityPBImpl@1f, State: NEW, Token: null, Status: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 1, }, state: C_NEW, ] for AM appattempt_1359422495723_0001_000001
> 2013-01-29 09:23:51,030 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from ALLOCATED to LAUNCHED
> 2013-01-29 09:23:51,575 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Container Transitioned from ACQUIRED to RUNNING
> 2013-01-29 09:23:57,108 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AM registration appattempt_1359422495723_0001_000001
> 2013-01-29 09:23:57,109 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop IP=10.167.14.222 OPERATION=Register App Master TARGET=ApplicationMasterServicRESULT=SUCCESS APPID=application_1359422495723_0001 APPATTEMPTID=appattempt_1359422495723_0001_000001
> 2013-01-29 09:23:57,109 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from LAUNCHED to RUNNING
> 2013-01-29 09:23:57,109 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from ACCEPTED to RUNNING
> 2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from NEW to ALLOCATED
> 2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000002
> 2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000002 of capacity memory: 1024 on host Hadoop03:39387, which currently has 1 containers, memory: 1024 used and memory: 7168 available
> 2013-01-29 09:23:59,168 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from ALLOCATED to ACQUIRED
> 2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from NEW to ALLOCATED
> 2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000003
> 2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000003 of capacity memory: 1024 on host Hadoop02:39876, which currently has 2 containers, memory: 2560 used and memory: 5632 available
> 2013-01-29 09:24:00,659 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from ACQUIRED to RUNNING
> 2013-01-29 09:24:01,196 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from ALLOCATED to ACQUIRED
> 2013-01-29 09:24:01,657 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from ACQUIRED to RUNNING
> 2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from RUNNING to COMPLETED
> 2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1359422495723_0001_01_000002 in state: COMPLETED event:FINISHED
> 2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000002
> 2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Released container container_1359422495723_0001_01_000002 of capacity memory: 1024 on host Hadoop03:39387, which currently has 0 containers, memory: 0 used and memory: 8192 available, release resources=true
> 2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1359422495723_0001_000001 released container container_1359422495723_0001_01_000002 on node: host: Hadoop03:39387 #containers=0 available=8192 used=0 with event: FINISHED
> 2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from RUNNING to COMPLETED
> 2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1359422495723_0001_01_000003 in state: COMPLETED event:FINISHED
> 2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000003
> 2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Released container container_1359422495723_0001_01_000003 of capacity memory: 1024 on host Hadoop02:39876, which currently has 1 containers, memory: 1536 used and memory: 6656 available, release resources=true
> 2013-01-29 09:24:07,525 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1359422495723_0001_000001 released container container_1359422495723_0001_01_000003 on node: host: Hadoop02:39876 #containers=1 available=6656 used=1536 with event: FINISHED
> 2013-01-29 09:24:11,597 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from RUNNING to FINISHING
> 2013-01-29 09:24:11,597 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from RUNNING to FINISHING
> 2013-01-29 09:24:12,554 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Container Transitioned from RUNNING to COMPLETED
> 2013-01-29 09:24:12,554 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1359422495723_0001_01_000001 in state: COMPLETED event:FINISHED
> 2013-01-29 09:24:12,554 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000001
> 2013-01-29 09:24:12,555 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Released container container_1359422495723_0001_01_000001 of capacity memory: 1536 on host Hadoop02:39876, which currently has 0 containers, memory: 0 used and memory: 8192 available, release resources=true
> 2013-01-29 09:24:12,555 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1359422495723_0001_000001 released container container_1359422495723_0001_01_000001 on node: host: Hadoop02:39876 #containers=0 available=8192 used=0 with event: FINISHED
> 2013-01-29 09:24:12,556 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from FINISHING to FINISHED
> 2013-01-29 09:24:12,557 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from FINISHING to FINISHED
> 2013-01-29 09:24:12,558 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=Application Finished - Succeeded TARGET=RMAppManager RESULT=SUCCESSAPPID=application_1359422495723_0001
> 2013-01-29 09:24:12,558 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1359422495723_0001 requests cleared
> 2013-01-29 09:24:12,560 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1359422495723_0001_000001
> 2013-01-29 09:24:12,560 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1359422495723_0001,name=word count,user=hadoop,queue=default,state=FINISHED,trackingUrl=Hadoop01:8088/proxy/application_1359422495723_0001/jobhistory/job/job_1359422495723_0001,appMasterHost=Hadoop02,startTime=1359422630195,finishTime=1359422651597
>
>

--047d7b33cfb210671004d4bee394
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Hi All
=A0 =A0 =A0 I am sorry to bother you guys, but = i have to =A0put up the problem againt .
I do want to get clear w= hy the some =A0= containers=A0 were killed.

the= details about this=A0situation are descriped in my =A0mail I've posted few days ago.
My questions:
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A01. Why =A0were 2 containers created in = Hadoop02,however Hadoop04 got=A0nothing.is=A0it normal ?
2. What is the principle that guides containe= rs to be created.
3. Why were the two container= s (the container_*_000003 and the container_*_000002) =A0killed, while the = container_*_000001 succeeded.
=A0=A0 is it normal?

=A0 =A0
=A0 =A0= =A0
= =A0 =A0= Any suggestion will be appreciated.


regards= =A0
YouPe= ng Yang


2013/1/31 YouPeng Yang <yypvsxf19870706@gmail.com>
Hi=A0

=A0 =A0I have posted m= y question for a day,please can somebody help me to figure =A0out
what the problem is.
=A0 =A0Thank you.
regards
YouPeng Yang


---------- Forwarded message ----------<= br>From: YouPeng Yang &= lt;yypvsxf19= 870706@gmail.com>
Date: 2013/1/30
Subject: YARN NM containers were killed
To: user@hadoop.apache.org=


i've tested the hadoop-mapreduce-= examples-2.0.0-cdh4.1.2.jar on my hadoop environment
( =A0 1 RM - Hadoop01 and 3 NM --Hadoop02,Hadoop03,Hadoop04
= =A0 OS:CDH4.1.2 rhel5.5):
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0-cdh= 4.1.2.jar =A0wordcount 1/input output

when i check= ed the log .i was confused by the plz:
my hadoop creates 2 contai= ners in Hadoop02,1 container in Hadoop03 ,however 0 container Hadoop04.

the result of the containers processing:

=
Hadoop02:
* container_1359422495723_0001_01_000001
(its state changes as follows:NEW --> LOCALIZING= --> LOCALIZED --> RUNNING --> KILLING --> EXITED_WITH_SUCCESS)=
=A0 =A0 =A0 =A0
=A0 =A0 =A0the log indates that:
NodeStatusUpdaterImpl: Sending out status for container: co= ntainer_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp:= 1359422495723, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics= : "", exit_status: -1000,=A0
ContainerLaunch: Contain= er container_1359422495723_0001_01_000001 succeeded=A0
Container: Container container_13594224= 95723_0001_01_000001 transitioned from RUNNING to EXITED_WITH_SUCCESS
ContainerLaunch: Cleanin= g up container container_1359422495723_0001_01_000001
NMAuditLogger: USER=3Dhadoop OPERATION=3DContainer Finished - Succeede= d TARGET=3DContainerImpl RESULT=3DSUCCESSAPPID=3Dapplication_1= 359422495723_0001 CONTAINERID= =3Dcontainer_1359422495723_0001_01_000001
* container_1359422495723_0001_01_000003=
(its state changes= as follows:NEW --> LOCALIZING --> LOCALIZED --> RUNNING --> KI= LLING --> CONTAINER_CLEANEDUP_AFTER_KILL--> DONE)
the log indates that:
NodeStatusUpdaterImpl= : Sending out status for container: container_id {, app_attempt_id {, appli= cation_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, i= d: 3, }, state: C_RUNNING, diagnostics: "Container killed by the Appli= cationMaster.\n", exit_status: -1000,=A0
DefaultContainerExecutor= : Exit code from task is : 137
NMAuditLogger: USER=3Dhadoop OPERATION=3DContainer Finished - Killed TARGET=3DContainerImpl RESULT=3DSUCCESS APP= ID=3Dapplication_1359422495723_0001 CONTAINERID=3Dcontainer_1359422495723_0001_01_000003

Hadoop03:
=A0 =A0 =A0 =A0 * container_1359422= 495723_0001_01_000002=A0
<= /span>(its state changes as follows:NEW --> LOCALIZING --> LOCALIZED = --> RUNNING --> KILLING --> CONTAINER_CLEANEDUP_AFTER_KILL--> D= ONE)
NodeStatusUpdaterImpl: S= ending out status for container: container_id {, app_attempt_id {, applicat= ion_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: = 2, }, state: C_RUNNING, diagnostics: "Container killed by the Applicat= ionMaster.\n", exit_status: -1000,=A0
=A0 =A0 =A0 =A0 DefaultContainerExecutor: Exit code from task is : 143=
NMAuditLogger: USE= R=3Dhadoop OPERATION=3DContaine= r Finished - Killed TARGET=3DCo= ntainerImpl RESULT=3DSUCCESS APPID=3Dapplication_1359422495723= _0001 CONTAINERID=3Dcontainer_1= 359422495723_0001_01_000002

My questions:
=A0 =A0 =A0 =A0 1. Why =A0were = 2 containers created in Hadoop02,however Hadoop04 got nothing.is it normal ?
2. What is the principle that guides con= tainers to be created.
3. Why were the two cont= ainers (the container_*_000003 and the container_*_000002) =A0killed, while= the container_*_000001 succeeded.
=A0 is it normal?


logs of Hadoop01 as follows:

2013= -01-29 09:23:48,904 INFO org.apache.hadoop.yarn.server.resourcemanager.Clie= ntRMService: Allocated new applicationId: 1
2013-01-29 09:23:50,201 INFO org.apache.hadoop.yarn.server.resourceman= ager.ClientRMService: Application with id 1 submitted by user hadoop
<= div>2013-01-29 09:23:50,204 INFO org.apache.hadoop.yarn.server.resourcemana= ger.RMAuditLogger: USER=3Dhadoop IP=3D10.167.14.221 OPERATION= =3DSubmit Application Request T= ARGET=3DClientRMServiceRESULT=3DSUCCESS APPID=3Dapplication_1359422495723_0001
2013-01-29 09:23:50,221 INFO org.apache.hadoop.yarn.server.resourceman= ager.rmapp.RMAppImpl: application_1359422495723_0001 State change from NEW = to SUBMITTED
2013-01-29 09:23:50,221 INFO org.apache.hadoop.yarn.= server.resourcemanager.ApplicationMasterService: Registering appattempt_135= 9422495723_0001_000001
2013-01-29 09:23:50,222 INFO org.apache.hadoop.yarn.server.resourceman= ager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 S= tate change from NEW to SUBMITTED
2013-01-29 09:23:50,242 INFO or= g.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: A= pplication Submission: application_1359422495723_0001 from hadoop, currentl= y active: 1
2013-01-29 09:23:50,250 INFO org.apache.hadoop.yarn.server.resourceman= ager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 S= tate change from SUBMITTED to SCHEDULED
2013-01-29 09:23:50,250 I= NFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: applicat= ion_1359422495723_0001 State change from SUBMITTED to ACCEPTED
2013-01-29 09:23:50,581 INFO org.apache.hadoop.yarn.server.resourceman= ager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Co= ntainer Transitioned from NEW to ALLOCATED
2013-01-29 09:23:50,58= 1 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=3D= hadoop OPERATION=3DAM Allocated= Container TARGET=3DSchedulerAp= p RESULT=3DSUCCESS APPID=3Dapplication_1359422495723_0001 CONTAINERID=3Dcontainer_135942249= 5723_0001_01_000001
2013-01-29 09:23:50,581 INFO org.apache.hadoop.yarn.server.resourceman= ager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_= 1359422495723_0001_01_000001 of capacity memory: 1536 on host Hadoop02:3987= 6, which currently has 1 containers, memory: 1536 used and memory: 6656 ava= ilable
2013-01-29 09:23:50,582 INFO org.apache.hadoop.yarn.server.resourceman= ager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Co= ntainer Transitioned from ALLOCATED to ACQUIRED
2013-01-29 09:23:= 50,583 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMA= ppAttemptImpl: appattempt_1359422495723_0001_000001 State change from SCHED= ULED to ALLOCATED
2013-01-29 09:23:50,587 INFO org.apache.hadoop.yarn.server.resourceman= ager.amlauncher.AMLauncher: Launching masterappattempt_1359422495723_0001_0= 00001
2013-01-29 09:23:50,606 INFO org.apache.hadoop.yarn.server.= resourcemanager.amlauncher.AMLauncher: Setting up container Container: [Con= tainerId: container_1359422495723_0001_01_000001, NodeId: Hadoop02:39876, N= odeHttpAddress: Hadoop02:8042, Resource: memory: 1536, Priority: org.apache= .hadoop.yarn.api.records.impl.pb.PriorityPBImpl@1f, State: NEW, Token: null= , Status: container_id {, app_attempt_id {, application_id {, id: 1, cluste= r_timestamp: 1359422495723, }, attemptId: 1, }, id: 1, }, state: C_NEW, ] f= or AM appattempt_1359422495723_0001_000001
2013-01-29 09:23:50,606 INFO org.apache.hadoop.yarn.server.resourceman= ager.amlauncher.AMLauncher: Command to launch container container_135942249= 5723_0001_01_000001 : $JAVA_HOME/bin/java -Dlog4j.configuration=3Dcontainer= -log4j.properties -Dyarn.app.mapreduce.container.log.dir=3D<LOG_DIR> = -Dyarn.app.mapreduce.container.log.filesize=3D0 -Dhadoop.root.logger=3DINFO= ,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_= DIR>/stdout 2><LOG_DIR>/stderr=A0
2013-01-29 09:23:51,030 INFO org.apache.hadoop.yarn.server.resourceman= ager.amlauncher.AMLauncher: Done launching container Container: [ContainerI= d: container_1359422495723_0001_01_000001, NodeId: Hadoop02:39876, NodeHttp= Address: Hadoop02:8042, Resource: memory: 1536, Priority: org.apache.hadoop= .yarn.api.records.impl.pb.PriorityPBImpl@1f, State: NEW, Token: null, Statu= s: container_id {, app_attempt_id {, application_id {, id: 1, cluster_times= tamp: 1359422495723, }, attemptId: 1, }, id: 1, }, state: C_NEW, ] for AM a= ppattempt_1359422495723_0001_000001
2013-01-29 09:23:51,030 INFO org.apache.hadoop.yarn.server.resourceman= ager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 S= tate change from ALLOCATED to LAUNCHED
2013-01-29 09:23:51,575 IN= FO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImp= l: container_1359422495723_0001_01_000001 Container Transitioned from ACQUI= RED to RUNNING
2013-01-29 09:23:57,108 INFO org.apache.hadoop.yarn.server.resourceman= ager.ApplicationMasterService: AM registration appattempt_1359422495723_000= 1_000001
2013-01-29 09:23:57,109 INFO org.apache.hadoop.yarn.serv= er.resourcemanager.RMAuditLogger: USER=3Dhadoop IP=3D10.167.14.222 OPERATION=3DRegister App Master <= /span>TARGET=3DApplicationMasterServicRESULT=3DSUCCESS APPID=3Dapplication_1359422495723_0001 APPATTEMPTID=3Dappattempt_1359422495723_0= 001_000001
2013-01-29 09:23:57,109 INFO org.apache.hadoop.yarn.server.resourceman= ager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 S= tate change from LAUNCHED to RUNNING
2013-01-29 09:23:57,109 INFO= org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application= _1359422495723_0001 State change from ACCEPTED to RUNNING
2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from NEW to ALLOCATED
2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000002
2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000002 of capacity memory: 1024 on host Hadoop03:39387, which currently has 1 containers, memory: 1024 used and memory: 7168 available
2013-01-29 09:23:59,168 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from ALLOCATED to ACQUIRED
2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from NEW to ALLOCATED
2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000003
2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000003 of capacity memory: 1024 on host Hadoop02:39876, which currently has 2 containers, memory: 2560 used and memory: 5632 available
2013-01-29 09:24:00,659 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from ACQUIRED to RUNNING
2013-01-29 09:24:01,196 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from ALLOCATED to ACQUIRED
2013-01-29 09:24:01,657 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from ACQUIRED to RUNNING
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from RUNNING to COMPLETED
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1359422495723_0001_01_000002 in state: COMPLETED event:FINISHED
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000002
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Released container container_1359422495723_0001_01_000002 of capacity memory: 1024 on host Hadoop03:39387, which currently has 0 containers, memory: 0 used and memory: 8192 available, release resources=true
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1359422495723_0001_000001 released container container_1359422495723_0001_01_000002 on node: host: Hadoop03:39387 #containers=0 available=8192 used=0 with event: FINISHED
2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from RUNNING to COMPLETED
2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1359422495723_0001_01_000003 in state: COMPLETED event:FINISHED
2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000003
2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Released container container_1359422495723_0001_01_000003 of capacity memory: 1024 on host Hadoop02:39876, which currently has 1 containers, memory: 1536 used and memory: 6656 available, release resources=true
2013-01-29 09:24:07,525 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1359422495723_0001_000001 released container container_1359422495723_0001_01_000003 on node: host: Hadoop02:39876 #containers=1 available=6656 used=1536 with event: FINISHED
2013-01-29 09:24:11,597 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from RUNNING to FINISHING
2013-01-29 09:24:11,597 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from RUNNING to FINISHING
2013-01-29 09:24:12,554 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Container Transitioned from RUNNING to COMPLETED
2013-01-29 09:24:12,554 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1359422495723_0001_01_000001 in state: COMPLETED event:FINISHED
2013-01-29 09:24:12,554 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000001
2013-01-29 09:24:12,555 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Released container container_1359422495723_0001_01_000001 of capacity memory: 1536 on host Hadoop02:39876, which currently has 0 containers, memory: 0 used and memory: 8192 available, release resources=true
2013-01-29 09:24:12,555 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1359422495723_0001_000001 released container container_1359422495723_0001_01_000001 on node: host: Hadoop02:39876 #containers=0 available=8192 used=0 with event: FINISHED
2013-01-29 09:24:12,556 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from FINISHING to FINISHED
2013-01-29 09:24:12,557 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from FINISHING to FINISHED
2013-01-29 09:24:12,558 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=Application Finished - Succeeded TARGET=RMAppManager RESULT=SUCCESS APPID=application_1359422495723_0001
2013-01-29 09:24:12,558 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1359422495723_0001 requests cleared
2013-01-29 09:24:12,560 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1359422495723_0001_000001
2013-01-29 09:24:12,560 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1359422495723_0001,name=word count,user=hadoop,queue=default,state=FINISHED,trackingUrl=Hadoop01:8088/proxy/application_1359422495723_0001/jobhistory/job/job_1359422495723_0001,appMasterHost=Hadoop02,startTime=1359422630195,finishTime=1359422651597
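
To follow each container's lifecycle in a ResourceManager log like the one above, the RMContainerImpl "Container Transitioned from X to Y" entries can be grouped per container. The following is a minimal Python sketch, assuming the log text has already been decoded to one entry per line. The function name container_timelines and the regular expression are illustrative only and are not part of Hadoop. The sample lines are the entries for container_1359422495723_0001_01_000002 from this log.

```python
import re
from collections import defaultdict

# Matches RMContainerImpl entries of the form seen in this log:
#   "<date> <time> INFO <package>.RMContainerImpl: <container id>
#    Container Transitioned from <STATE> to <STATE>"
TRANSITION = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO \S+RMContainerImpl: "
    r"(?P<cid>container_\S+) Container Transitioned from (?P<src>\w+) to (?P<dst>\w+)"
)

def container_timelines(lines):
    """Return {container_id: [(timestamp, from_state, to_state), ...]} in log order."""
    timelines = defaultdict(list)
    for line in lines:
        m = TRANSITION.match(line.strip())
        if m:  # entries from other loggers are ignored
            timelines[m["cid"]].append((m["ts"], m["src"], m["dst"]))
    return dict(timelines)

PKG = "org.apache.hadoop.yarn.server.resourcemanager"
CID = "container_1359422495723_0001_01_000002"
sample = [
    f"2013-01-29 09:23:58,616 INFO {PKG}.rmcontainer.RMContainerImpl: {CID} Container Transitioned from NEW to ALLOCATED",
    f"2013-01-29 09:23:58,616 INFO {PKG}.scheduler.common.fica.FiCaSchedulerNode: Assigned container {CID} of capacity memory: 1024 on host Hadoop03:39387, which currently has 1 containers, memory: 1024 used and memory: 7168 available",
    f"2013-01-29 09:23:59,168 INFO {PKG}.rmcontainer.RMContainerImpl: {CID} Container Transitioned from ALLOCATED to ACQUIRED",
    f"2013-01-29 09:24:00,659 INFO {PKG}.rmcontainer.RMContainerImpl: {CID} Container Transitioned from ACQUIRED to RUNNING",
    f"2013-01-29 09:24:05,674 INFO {PKG}.rmcontainer.RMContainerImpl: {CID} Container Transitioned from RUNNING to COMPLETED",
]

if __name__ == "__main__":
    for cid, steps in container_timelines(sample).items():
        print(cid)
        for ts, src, dst in steps:
            print(f"  {ts}  {src} -> {dst}")
```

The sketch only reports what the ResourceManager recorded. It does not say why a container finished; for that, the NodeManager log on the host that ran the container would still have to be checked.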




--047d7b33cfb210671004d4bee394--