hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1094) RM restart throws Null pointer Exception in Secure Env
Date Sat, 24 Aug 2013 20:59:51 GMT

     [ https://issues.apache.org/jira/browse/YARN-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1094:

    Attachment: YARN-1094-20130824.txt

Here's a patch that fixes this bug.
 - Moved delegationTokenRenewer's start to be explicit and to happen before the state-store starts.
 - Made GetDelegationTokenRequest.newInstance static. This was a pre-existing bug!
 - Made fixes to consistently use RMDelegationTokenRenewer only in secure mode.
 - Made some cosmetic changes to refer to tokenRenewer more specifically as delegationTokenRenewer.

TestRMRestart.testDelegationTokenRestoredInDelegationTokenRenewer fails with the same NPE
without the code changes and passes with them.

Also tested this on a single-node secure setup, where I first reproduced the NPE easily and
then verified that RM restart works as expected after the patch.
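
To illustrate the start-ordering fix in the first bullet, here is a minimal, self-contained sketch. It is not the patch itself: TokenRenewerSketch, recover, and the 60-second delay are hypothetical names and values made up for illustration, and the sketch assumes the NPE comes from a timer field that is only created when the renewer is started. Recovery that calls addApplication before start() hits the null field; starting the renewer first avoids it.

```java
import java.util.Timer;
import java.util.TimerTask;

public class RenewerStartOrderSketch {

    /** Hypothetical stand-in for a renewer whose timer exists only after start(). */
    static class TokenRenewerSketch {
        private Timer renewalTimer;      // stays null until start() is called
        private int scheduledRenewals;

        void start() {
            renewalTimer = new Timer("renewer-sketch", true); // daemon thread
        }

        void addApplication(String appId) {
            // Throws NullPointerException if start() has not run yet.
            renewalTimer.schedule(new TimerTask() {
                @Override
                public void run() {
                    // A real renewer would renew the token here.
                }
            }, 60_000L);
            scheduledRenewals++;
        }

        int scheduledRenewals() {
            return scheduledRenewals;
        }

        void stop() {
            if (renewalTimer != null) {
                renewalTimer.cancel();
            }
        }
    }

    /** Recovers one application, optionally starting the renewer first; true on success. */
    static boolean recover(boolean startRenewerFirst) {
        TokenRenewerSketch renewer = new TokenRenewerSketch();
        try {
            if (startRenewerFirst) {
                renewer.start();
            }
            renewer.addApplication("application_1377280618693_0001");
            return renewer.scheduledRenewals() == 1;
        } catch (NullPointerException e) {
            // Mirrors the failure mode in the report: recovery ran before the renewer started.
            return false;
        } finally {
            renewer.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println("recover before start: " + recover(false));
        System.out.println("recover after start:  " + recover(true));
    }
}
```

Under these assumptions, recover(false) reports a failure and recover(true) succeeds, which is the ordering the patch enforces by starting the renewer explicitly before state-store recovery.
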
> RM restart throws Null pointer Exception in Secure Env
> ------------------------------------------------------
>                 Key: YARN-1094
>                 URL: https://issues.apache.org/jira/browse/YARN-1094
>             Project: Hadoop YARN
>          Issue Type: Bug
>         Environment: secure env
>            Reporter: yeshavora
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Blocker
>         Attachments: YARN-1094-20130824.txt
> Enable the RM restart feature and restart the ResourceManager while a job is running.
> The ResourceManager fails to start with the error below:
> 2013-08-23 17:57:40,705 INFO  resourcemanager.RMAppManager (RMAppManager.java:recover(370)) - Recovering application application_1377280618693_0001
> 2013-08-23 17:57:40,763 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(617)) - Failed to load/recover state
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.setTimerForTokenRenewal(DelegationTokenRenewer.java:371)
>         at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.addApplication(DelegationTokenRenewer.java:307)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:291)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:371)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:819)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:613)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:832)
> 2013-08-23 17:57:40,766 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1

