Date: Thu, 24 Apr 2014 16:23:23 -0400
Subject: Yarn hangs @Scheduled
From: Jay Vyas
To: "common-user@hadoop.apache.org"

Hi folks:

My YARN jobs seem to be hanging in the "SCHEDULED" state. I've restarted my NodeManager a few times, but no luck.

What are the possible reasons that YARN job submission hangs? I know one is resource availability, but this is a fresh cluster on a VM with only one job, one NM, and one RM.

14/04/24 16:20:32 INFO ipc.Server: Auth successful for yarn@IDH1.LOCAL (auth:SIMPLE)
14/04/24 16:20:32 INFO authorize.ServiceAuthorizationManager: Authorization successful for yarn@IDH1.LOCAL (auth:KERBEROS) for protocol=interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB
14/04/24 16:20:32 INFO resourcemanager.ClientRMService: Allocated new applicationId: 4
14/04/24 16:20:33 INFO resourcemanager.ClientRMService: Application with id 4 submitted by user yarn
14/04/24 16:20:33 INFO resourcemanager.RMAuditLogger: USER=yarn IP=192.168.122.100 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1398370674313_0004
14/04/24 16:20:33 INFO rmapp.RMAppImpl: Storing application with id application_1398370674313_0004
14/04/24 16:20:33 INFO rmapp.RMAppImpl: application_1398370674313_0004 State change from NEW to NEW_SAVING
14/04/24 16:20:33 INFO recovery.RMStateStore: Storing info for app: application_1398370674313_0004
14/04/24 16:20:33 INFO rmapp.RMAppImpl: application_1398370674313_0004 State change from NEW_SAVING to SUBMITTED
14/04/24 16:20:33 INFO fair.FairScheduler: Accepted application application_1398370674313_0004 from user: yarn, in queue: default, currently num of applications: 4
14/04/24 16:20:33 INFO rmapp.RMAppImpl: application_1398370674313_0004 State change from SUBMITTED to ACCEPTED
14/04/24 16:20:33 INFO resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1398370674313_0004_000001
14/04/24 16:20:33 INFO attempt.RMAppAttemptImpl: appattempt_1398370674313_0004_000001 State change from NEW to SUBMITTED
14/04/24 16:20:33 INFO fair.FairScheduler: Added Application Attempt appattempt_1398370674313_0004_000001 to scheduler from user: yarn
14/04/24 16:20:33 INFO attempt.RMAppAttemptImpl: appattempt_1398370674313_0004_000001 State change from SUBMITTED to SCHEDULED

--
Jay Vyas
http://jayunit100.blogspot.com
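
The state transitions in the ResourceManager log above can be pulled out mechanically, which makes it easier to see where the application and its attempt stop. The following is a minimal sketch, not part of the original message: the function name `trace_states` and the regular expression are illustrative assumptions, and the sample input is limited to the "State change" lines quoted in the post.

```python
import re

# Matches the "State change" lines written by RMAppImpl and RMAppAttemptImpl,
# e.g. "application_1398370674313_0004 State change from NEW to NEW_SAVING".
# The pattern is an assumption based only on the log lines quoted in the post.
STATE_CHANGE = re.compile(
    r"(?P<id>(?:application|appattempt)_\d+_\d+(?:_\d+)?) "
    r"State change from (?P<old>\w+) to (?P<new>\w+)"
)


def trace_states(log_text):
    """Return {id: [state, ...]} in the order the transitions were logged."""
    traces = {}
    for line in log_text.splitlines():
        match = STATE_CHANGE.search(line)
        if match is None:
            continue  # not a state-change line
        # Seed the trace with the first "from" state seen for this id.
        states = traces.setdefault(match.group("id"), [match.group("old")])
        states.append(match.group("new"))
    return traces


# The state-change lines copied from the log in the post.
RM_LOG = """\
14/04/24 16:20:33 INFO rmapp.RMAppImpl: application_1398370674313_0004 State change from NEW to NEW_SAVING
14/04/24 16:20:33 INFO rmapp.RMAppImpl: application_1398370674313_0004 State change from NEW_SAVING to SUBMITTED
14/04/24 16:20:33 INFO rmapp.RMAppImpl: application_1398370674313_0004 State change from SUBMITTED to ACCEPTED
14/04/24 16:20:33 INFO attempt.RMAppAttemptImpl: appattempt_1398370674313_0004_000001 State change from NEW to SUBMITTED
14/04/24 16:20:33 INFO attempt.RMAppAttemptImpl: appattempt_1398370674313_0004_000001 State change from SUBMITTED to SCHEDULED
"""

for ident, states in trace_states(RM_LOG).items():
    print(f"{ident}: {' -> '.join(states)}")
```

Run against the quoted lines, this shows the application reaching ACCEPTED and the attempt stopping at SCHEDULED, with no later transition logged. That matches the symptom described in the post; it does not by itself say why no further transition happens.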