hadoop-yarn-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3061) NPE in RM AppBlock render
Date Wed, 14 Jan 2015 14:34:37 GMT

    [ https://issues.apache.org/jira/browse/YARN-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276956#comment-14276956 ]

Steve Loughran commented on YARN-3061:
--------------------------------------

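In the source, the line in question is {{RMAppAttemptMetrics attemptMetrics = rmApp.getCurrentAppAttempt().getRMAppAttemptMetrics();}}

Clearly the app failed *before any app attempt was created*, so {{getCurrentAppAttempt()}} presumably returned null and the chained call threw the NPE.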
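A minimal sketch of the kind of null guard that would avoid the NPE when an application has no attempt yet. The three interfaces are hypothetical stand-ins for the YARN types named in the quoted line, reduced to the two accessors involved. The class name and the `currentAttemptMetrics` helper are invented for illustration and are not the actual fix for this issue.

```java
import java.util.Optional;

public class AppBlockNullGuardSketch {

    // Hypothetical stand-ins for the real YARN types; only the accessors
    // used in the quoted source line are modelled here.
    interface RMAppAttemptMetrics { }

    interface RMAppAttempt {
        RMAppAttemptMetrics getRMAppAttemptMetrics();
    }

    interface RMApp {
        RMAppAttempt getCurrentAppAttempt();
    }

    // Returns the current attempt's metrics, or empty when the application
    // failed before any attempt was created (getCurrentAppAttempt() == null).
    static Optional<RMAppAttemptMetrics> currentAttemptMetrics(RMApp rmApp) {
        return Optional.ofNullable(rmApp.getCurrentAppAttempt())
                .map(RMAppAttempt::getRMAppAttemptMetrics);
    }

    public static void main(String[] args) {
        // An app that went NEW -> FINAL_SAVING -> FAILED with no attempt.
        RMApp failedBeforeAttempt = () -> null;

        // An app that does have a current attempt with metrics.
        RMAppAttemptMetrics metrics = new RMAppAttemptMetrics() { };
        RMApp withAttempt = () -> () -> metrics;

        System.out.println("no attempt   -> metrics present: "
                + currentAttemptMetrics(failedBeforeAttempt).isPresent());
        System.out.println("with attempt -> metrics present: "
                + currentAttemptMetrics(withAttempt).isPresent());
    }
}
```

A renderer written this way could show a placeholder for the attempt-specific fields instead of failing the whole page with a 500.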

The root cause looks like a token renewal failure, probably caused by the VM save/resume
and, by the look of things, related to Kerberos renewal.

{code}
org.apache.slider.funtest.lifecycle.AgentWebPagesIT
testAgentWeb(org.apache.slider.funtest.lifecycle.AgentWebPagesIT)  Time elapsed: 194.768 sec <<< FAILURE!
java.lang.AssertionError: Application Launch Failure, exit code  65
Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, Service: 192.168.1.134:8188, Ident: (owner=stevel, renewer=yarn, realUser=, issueDate=1421245210012, maxDate=1421850010012, sequenceNumber=11, masterKeyId=6)
	at org.junit.Assert.fail(Assert.java:88)
	at org.apache.slider.funtest.framework.CommandTestBase.createTemplatedSliderApplication(CommandTestBase.groovy:691)
	at org.apache.slider.funtest.lifecycle.AgentWebPagesIT.testAgentWeb(AgentWebPagesIT.groovy:76)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

{code}
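The {{issueDate}} and {{maxDate}} fields in the token identifier are milliseconds since the Unix epoch. As a small aid to reading the message, the sketch below decodes the two values copied from the trace. The class and helper names are made up for illustration.

```java
import java.time.Duration;
import java.time.Instant;

public class TokenDateSketch {

    // Converts an epoch-millisecond field from the token identifier to an Instant (UTC).
    static Instant toInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    // Maximum lifetime of the token: the span between issueDate and maxDate.
    static Duration maxLifetime(long issueDateMillis, long maxDateMillis) {
        return Duration.ofMillis(maxDateMillis - issueDateMillis);
    }

    public static void main(String[] args) {
        // Values copied verbatim from the "Failed to renew token" message.
        long issueDate = 1421245210012L;
        long maxDate = 1421850010012L;

        System.out.println("issued:       " + toInstant(issueDate));
        System.out.println("max date:     " + toInstant(maxDate));
        System.out.println("max lifetime: " + maxLifetime(issueDate, maxDate).toDays() + " days");
    }
}
```

The values decode to an issue time of 14:20:10 GMT on 14 January 2015 and a maximum lifetime of seven days, so the token had only just been issued when the test reported the failure.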

Server side:
{code}
2015-01-14 14:20:16,993 ERROR metrics.SystemMetricsPublisher (SystemMetricsPublisher.java:putEntity(427)) - Error when publishing entity [YARN_APPLICATION,application_1420734007650_0010]
org.apache.hadoop.yarn.exceptions.YarnException: Failed to get the response from the timeline server.
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPosting(TimelineClientImpl.java:339)
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:301)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.putEntity(SystemMetricsPublisher.java:425)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.publishApplicationCreatedEvent(SystemMetricsPublisher.java:258)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.handleSystemMetricsEvent(SystemMetricsPublisher.java:213)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher$ForwardingEventHandler.handle(SystemMetricsPublisher.java:442)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher$ForwardingEventHandler.handle(SystemMetricsPublisher.java:437)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
	at java.lang.Thread.run(Thread.java:745)
2015-01-14 14:20:35,026 INFO  impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline service address: http://devix.cotham.uk:8188/ws/v1/timeline/
2015-01-14 14:20:35,766 WARN  security.DelegationTokenRenewer (DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(785)) - Unable to add the application to the delegation token renewer.
java.io.IOException: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, Service: 192.168.1.134:8188, Ident: (owner=stevel, renewer=yarn, realUser=, issueDate=1421245210012, maxDate=1421850010012, sequenceNumber=11, masterKeyId=6)
	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: HTTP status [401], message [Unauthorized]
	at org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:169)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:286)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.renewDelegationToken(DelegationTokenAuthenticator.java:211)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.renewDelegationToken(DelegationTokenAuthenticatedURL.java:414)
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:394)
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:380)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$4.run(TimelineClientImpl.java:449)
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:162)
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:464)
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.renewDelegationToken(TimelineClientImpl.java:398)
	at org.apache.hadoop.yarn.security.client.TimelineDelegationTokenIdentifier$Renewer.renew(TimelineDelegationTokenIdentifier.java:81)
	at org.apache.hadoop.security.token.Token.renew(Token.java:377)
	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:511)
	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:425)
	... 6 more
2015-01-14 14:20:36,169 INFO  rmapp.RMAppImpl (RMAppImpl.java:rememberTargetTransitionsAndStoreState(992)) - Updating application application_1420734007650_0010 with final state: FAILED
2015-01-14 14:20:36,185 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1420734007650_0010 State change from NEW to FINAL_SAVING
2015-01-14 14:20:36,490 INFO  recovery.RMStateStore (RMStateStore.java:transition(161)) - Updating info for app: application_1420734007650_0010
2015-01-14 14:20:37,274 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1420734007650_0010 State change from FINAL_SAVING to FAILED
{code}

I plan to fix all that by restarting the VM. The NPE in the web view, though, is something that
could recur in similar circumstances.

> NPE in RM AppBlock render
> -------------------------
>
>                 Key: YARN-3061
>                 URL: https://issues.apache.org/jira/browse/YARN-3061
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Steve Loughran
>            Assignee: Varun Saxena
>            Priority: Minor
>
> An RM (running in a VM which did a sleep/resume overnight) no longer launches apps, and
> when you try to look at the logs, the Web UI says "500 look at the logs", which show a stack trace



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
