hadoop-yarn-issues mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-960) TestMRCredentials and TestBinaryTokenFile are failing on trunk
Date Thu, 25 Jul 2013 00:07:49 GMT

    [ https://issues.apache.org/jira/browse/YARN-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719066#comment-13719066 ]

Alejandro Abdelnur commented on YARN-960:
-----------------------------------------

LGTM. However, even with this patch I cannot get the pi example to work in a pseudo-distributed
setup; the localization of the AM is failing with:

{code}
2013-07-24 16:58:19,057 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful
for appattempt_1374710243541_0001_000002 (auth:SIMPLE)
2013-07-24 16:58:19,061 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
Start request for container_1374710243541_0001_02_000001 by user tucu
2013-07-24 16:58:19,061 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=tucu
IP=172.21.3.149	OPERATION=Start Container Request	TARGET=ContainerManageImpl	RESULT=SUCCESS
APPID=application_1374710243541_0001	CONTAINERID=container_1374710243541_0001_02_000001
2013-07-24 16:58:19,061 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Adding container_1374710243541_0001_02_000001 to application application_1374710243541_0001
2013-07-24 16:58:19,062 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1374710243541_0001_02_000001 transitioned from NEW to LOCALIZING
2013-07-24 16:58:19,062 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource hdfs://localhost:8020/tmp/hadoop-yarn/staging/tucu/.staging/job_1374710243541_0001/job.jar
transitioned from INIT to DOWNLOADING
2013-07-24 16:58:19,062 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Created localizer for container_1374710243541_0001_02_000001
2013-07-24 16:58:19,109 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Writing credentials to the nmPrivate file /tmp/hadoop-tucu/nm-local-dir/nmPrivate/container_1374710243541_0001_02_000001.tokens.
Credentials list: 
2013-07-24 16:58:19,130 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Initializing user tucu
2013-07-24 16:58:19,255 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Copying from /tmp/hadoop-tucu/nm-local-dir/nmPrivate/container_1374710243541_0001_02_000001.tokens
to /tmp/hadoop-tucu/nm-local-dir/usercache/tucu/appcache/application_1374710243541_0001/container_1374710243541_0001_02_000001.tokens
2013-07-24 16:58:19,256 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
CWD set to /tmp/hadoop-tucu/nm-local-dir/usercache/tucu/appcache/application_1374710243541_0001
= file:/tmp/hadoop-tucu/nm-local-dir/usercache/tucu/appcache/application_1374710243541_0001
2013-07-24 16:58:19,691 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
DEBUG: FAILED { hdfs://localhost:8020/tmp/hadoop-yarn/staging/tucu/.staging/job_1374710243541_0001/job.jar,
1374710294773, PATTERN, (?:classes/|lib/).* }, rename destination /tmp/hadoop-tucu/nm-local-dir/usercache/tucu/appcache/application_1374710243541_0001/filecache/12
already exists.
2013-07-24 16:58:19,692 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource hdfs://localhost:8020/tmp/hadoop-yarn/staging/tucu/.staging/job_1374710243541_0001/job.jar
transitioned from DOWNLOADING to FAILED
2013-07-24 16:58:19,692 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1374710243541_0001_02_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2013-07-24 16:58:19,693 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
Container container_1374710243541_0001_02_000001 sent RELEASE event on a resource request
{ hdfs://localhost:8020/tmp/hadoop-yarn/staging/tucu/.staging/job_1374710243541_0001/job.jar,
1374710294773, PATTERN, (?:classes/|lib/).* } not present in cache.
2013-07-24 16:58:19,694 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Deleting absolute path : /tmp/hadoop-tucu/nm-local-dir/usercache/tucu/appcache/application_1374710243541_0001/container_1374710243541_0001_02_000001
2013-07-24 16:58:19,694 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Unknown localizer with localizerId container_1374710243541_0001_02_000001 is sending heartbeat.
Ordering it to DIE
2013-07-24 16:58:19,694 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
delete returned false for path: [/tmp/hadoop-tucu/nm-local-dir/usercache/tucu/appcache/application_1374710243541_0001/container_1374710243541_0001_02_000001]
2013-07-24 16:58:19,694 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=tucu
OPERATION=Container Finished - Failed	TARGET=ContainerImpl	RESULT=FAILURE	DESCRIPTION=Container
failed with state: LOCALIZATION_FAILED	APPID=application_1374710243541_0001	CONTAINERID=container_1374710243541_0001_02_000001
2013-07-24 16:58:19,694 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1374710243541_0001_02_000001 transitioned from LOCALIZATION_FAILED to
DONE
2013-07-24 16:58:19,695 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Removing container_1374710243541_0001_02_000001 from application application_1374710243541_0001
2013-07-24 16:58:19,695 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
ResourceCalculatorPlugin is unavailable on this system. org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
is disabled.
2013-07-24 16:58:19,695 WARN org.apache.hadoop.ipc.Client: interrupted waiting to send rpc
request to server
java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1279)
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
	at java.util.concurrent.FutureTask.get(FutureTask.java:83)
	at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
	at org.apache.hadoop.ipc.Client.call(Client.java:1401)
	at org.apache.hadoop.ipc.Client.call(Client.java:1381)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at com.sun.proxy.$Proxy25.heartbeat(Unknown Source)
	at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:250)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:164)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:107)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:980)
2013-07-24 16:58:20,050 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl:
Sending out status for container: container_id {, app_attempt_id {, application_id {, id:
1, cluster_timestamp: 1374710243541, }, attemptId: 2, }, id: 1, }, state: C_COMPLETE, diagnostics:
"rename destination /tmp/hadoop-tucu/nm-local-dir/usercache/tucu/appcache/application_1374710243541_0001/filecache/12
already exists.\n", exit_status: -1000, 
2013-07-24 16:58:20,050 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl:
Removed completed container container_1374710243541_0001_02_000001
2013-07-24 16:58:21,057 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1374710243541_0001 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2013-07-24 16:58:21,058 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Deleting absolute path : /tmp/hadoop-tucu/nm-local-dir/usercache/tucu/appcache/application_1374710243541_0001
2013-07-24 16:58:21,058 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices:
Got event APPLICATION_STOP for appId application_1374710243541_0001
2013-07-24 16:58:21,061 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1374710243541_0001 transitioned from APPLICATION_RESOURCES_CLEANINGUP
to FINISHED
2013-07-24 16:58:21,061 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler:
Scheduling Log Deletion for application: application_1374710243541_0001, with delay of 10800
seconds
{code}

I wonder whether this is related to the fallout from the token changes.
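
For anyone triaging a NodeManager log like the one above, here is a minimal sketch of a helper that lists the "transitioned from X to Y" events in order and reports the first one that ends in a failed state. It is not part of the patch or of Hadoop; the names `transitions`, `first_failure`, and `SAMPLE` are made up for illustration, and the pattern only assumes the wording visible in this log (including the line wrapping added by the mail archive).

```python
import re

# Matches "<Kind> <name> transitioned from <OLD> to <NEW>" as it appears in the
# log above. \s+ tolerates the line breaks inserted by the mail archive.
TRANSITION = re.compile(
    r"\b(Container|Resource|Application)\s+(\S+)\s+"
    r"transitioned\s+from\s+(\w+)\s+to\s+(\w+)"
)


def transitions(log_text):
    """Return (kind, name, old_state, new_state) tuples in log order."""
    return [m.groups() for m in TRANSITION.finditer(log_text)]


def first_failure(log_text):
    """Return the first transition whose new state contains FAILED, or None."""
    for event in transitions(log_text):
        if "FAILED" in event[3]:
            return event
    return None


# Three entries copied from the log in this comment.
SAMPLE = (
    "2013-07-24 16:58:19,062 INFO org.apache.hadoop.yarn.server.nodemanager."
    "containermanager.container.Container:\n"
    "Container container_1374710243541_0001_02_000001 transitioned from NEW to LOCALIZING\n"
    "2013-07-24 16:58:19,692 INFO org.apache.hadoop.yarn.server.nodemanager."
    "containermanager.localizer.LocalizedResource:\n"
    "Resource hdfs://localhost:8020/tmp/hadoop-yarn/staging/tucu/.staging/"
    "job_1374710243541_0001/job.jar\n"
    "transitioned from DOWNLOADING to FAILED\n"
    "2013-07-24 16:58:19,692 INFO org.apache.hadoop.yarn.server.nodemanager."
    "containermanager.container.Container:\n"
    "Container container_1374710243541_0001_02_000001 transitioned from "
    "LOCALIZING to LOCALIZATION_FAILED\n"
)

if __name__ == "__main__":
    print(first_failure(SAMPLE))
```

On the excerpt above, the first failing transition is the job.jar resource going from DOWNLOADING to FAILED, which precedes the container's LOCALIZATION_FAILED state. That ordering matches the "rename destination ... already exists" diagnostic being the trigger rather than a symptom.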
                
> TestMRCredentials and TestBinaryTokenFile are failing on trunk
> --------------------------------------------------------------
>
>                 Key: YARN-960
>                 URL: https://issues.apache.org/jira/browse/YARN-960
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.1.0-beta
>            Reporter: Alejandro Abdelnur
>            Assignee: Daryn Sharp
>            Priority: Blocker
>             Fix For: 2.1.0-beta
>
>         Attachments: YARN-960.patch
>
>
> Not sure, but this may be a fallout from YARN-701 and/or related to YARN-945.
> Making it a blocker until full impact of the issue is scoped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
