Date: Mon, 27 Apr 2015 18:44:21 +0000 (UTC)
From: REYANE OUKPEDJO
To: User Hadoop
Message-ID: <1196866035.5383488.1430160261768.JavaMail.yahoo@mail.yahoo.com>
Subject: Yarn Localization failure on hadoop
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_5383487_398864816.1430160261764"

------=_Part_5383487_398864816.1430160261764
Content-Type: text/plain; charset=UTF-8

Hi there,

I have an application that is trying to launch an AM, but localization is failing with the error message below. The resource visibility is set to private, which means the localizer runs through the container executor as the user who submitted the job. To rule out that user's Kerberos credentials, I checked that the hdfs dfs -ls / command works fine when run as the user submitting the application. A MapReduce pi example also runs successfully. Any idea what could cause this kind of issue?

Thanks

Reyane OUKPEDJO

NODE MANAGER LOGS HERE

2015-04-27 22:30:48,682 INFO  authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(114)) - Authorization successful for testing (auth:TOKEN) for protocol=interface org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB
2015-04-27 22:30:49,463 INFO  localizer.ResourceLocalizationService (ResourceLocalizationService.java:update(932)) - DEBUG: FAILED { hdfs://datanode5.in.ibm.com:8020/user/dsadm/.staging/AppMaster.jar, 1430154129414, FILE, null }, Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "datanode3.in.ibm.com/9.126.90.234"; destination host is: "datanode5.in.ibm.com":8020;
2015-04-27 22:30:49,463 INFO  localizer.LocalizedResource (LocalizedResource.java:handle(196)) - Resource hdfs://datanode5.in.ibm.com:8020/user/dsadm/.staging/AppMaster.jar transitioned from DOWNLOADING to FAILED
2015-04-27 22:30:49,463 INFO  container.Container (ContainerImpl.java:handle(901)) - Container container_1429925144518_0016_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2015-04-27 22:30:49,464 INFO  localizer.LocalResourcesTrackerImpl (LocalResourcesTrackerImpl.java:handle(137)) - Container container_1429925144518_0016_01_000001 sent RELEASE event on a resource request { hdfs://datanode5.in.ibm.com:8020/user/dsadm/.staging/AppMaster.jar, 1430154129414, FILE, null } not present in cache.
2015-04-27 22:30:49,464 WARN  nodemanager.NMAuditLogger (NMAuditLogger.java:logFailure(150)) - USER=dsadm  OPERATION=Container Finished - Failed  TARGET=ContainerImpl  RESULT=FAILURE  DESCRIPTION=Container failed with state: LOCALIZATION_FAILED  APPID=application_1429925144518_0016  CONTAINERID=container_1429925144518_0016_01_000001
2015-04-27 22:30:49,464 INFO  container.Container (ContainerImpl.java:handle(901)) - Container container_1429925144518_0016_01_000001 transitioned from LOCALIZATION_FAILED to DONE
2015-04-27 22:30:49,465 INFO  application.Application (ApplicationImpl.java:transition(339)) - Removing container_1429925144518_0016_01_000001 from application application_1429925144518_0016
2015-04-27 22:30:49,465 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:isEnabled(169)) - Neither virutal-memory nor physical-memory monitoring is needed. Not running the monitor-thread
2015-04-27 22:30:49,465 INFO  containermanager.AuxServices (AuxServices.java:handle(175)) - Got event CONTAINER_STOP for appId application_1429925144518_0016

------=_Part_5383487_398864816.1430160261764
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

------=_Part_5383487_398864816.1430160261764--
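
To make the failure sequence in the NodeManager log easier to follow, the state transitions can be pulled out with a few lines of Python. This is only a rough sketch and not part of any Hadoop tooling: the `transitions` helper, the regular expression, and the `SAMPLE` excerpt (three lines copied from the log above) are illustrative assumptions about the log format as it appears in this message.

```python
import re

# Assumed line shape, taken from the log in this message:
#   "... - Container <id> transitioned from <OLD> to <NEW>"
#   "... - Resource <uri> transitioned from <OLD> to <NEW>"
TRANSITION = re.compile(
    r"(?P<kind>Container|Resource) (?P<name>\S+) transitioned from "
    r"(?P<old>[A-Z_]+) to (?P<new>[A-Z_]+)"
)


def transitions(log_text):
    """Return (kind, name, old_state, new_state) tuples in log order."""
    return [
        (m["kind"], m["name"], m["old"], m["new"])
        for m in TRANSITION.finditer(log_text)
    ]


# Three lines copied from the NodeManager log above.
SAMPLE = """\
2015-04-27 22:30:49,463 INFO  localizer.LocalizedResource (LocalizedResource.java:handle(196)) - Resource hdfs://datanode5.in.ibm.com:8020/user/dsadm/.staging/AppMaster.jar transitioned from DOWNLOADING to FAILED
2015-04-27 22:30:49,463 INFO  container.Container (ContainerImpl.java:handle(901)) - Container container_1429925144518_0016_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2015-04-27 22:30:49,464 INFO  container.Container (ContainerImpl.java:handle(901)) - Container container_1429925144518_0016_01_000001 transitioned from LOCALIZATION_FAILED to DONE
"""

if __name__ == "__main__":
    for kind, name, old, new in transitions(SAMPLE):
        print(f"{kind} {name}: {old} -> {new}")
```

Read this way, the resource goes from DOWNLOADING to FAILED before the container moves from LOCALIZING to LOCALIZATION_FAILED and then DONE, so the download of AppMaster.jar from HDFS is the first step that fails.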