hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1405) RM hangs on shutdown if calling system.exit in serviceInit or serviceStart
Date Sun, 08 Dec 2013 11:01:37 GMT

    [ https://issues.apache.org/jira/browse/YARN-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842474#comment-13842474 ]

Hudson commented on YARN-1405:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #415 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/415/])
YARN-1405. Fixed ResourceManager to not hang when init/start fails with an exception w.r.t state-store. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1548992)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
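
One plausible reading of the hang named in the issue title is that the thread calling System.exit still holds a lock that a shutdown hook needs in order to stop the service. The message above does not confirm this mechanism, so treat the following as a minimal sketch under that assumption. The class, the lock, and the method name are hypothetical and are not taken from ResourceManager.java. To keep the example safe to run, it does not call System.exit; it runs the hook's work on a second thread with a timeout.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ExitHangSketch {
    // Hypothetical stand-in for a lock that guards service state transitions.
    static final ReentrantLock stateLock = new ReentrantLock();

    /**
     * Simulates "start" holding the lock while a hook thread tries to "stop".
     * Returns true only if the hook thread obtained the lock within the timeout.
     */
    static boolean hookCanStopWhileStarting(long timeoutMillis) {
        final boolean[] acquired = {false};
        stateLock.lock(); // the "start" path takes the lock
        try {
            // Assumption: a real shutdown hook would call stop(), which needs the same lock.
            Thread hook = new Thread(() -> {
                try {
                    if (stateLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                        acquired[0] = true;
                        stateLock.unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            hook.start();
            try {
                // System.exit also waits for its hooks; here we wait for the simulated one.
                hook.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } finally {
            stateLock.unlock();
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println("hook acquired lock: " + hookCanStopWhileStarting(200));
    }
}
```

In this sketch the hook thread times out instead of blocking forever. With a real System.exit call and an untimed lock, the exiting thread and the hook would wait on each other indefinitely.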


> RM hangs on shutdown if calling system.exit in serviceInit or serviceStart
> --------------------------------------------------------------------------
>
>                 Key: YARN-1405
>                 URL: https://issues.apache.org/jira/browse/YARN-1405
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 2.2.0
>            Reporter: Yesha Vora
>            Assignee: Jian He
>             Fix For: 2.4.0
>
>         Attachments: YARN-1405.1.patch, rm-threaddump.out
>
>
> Enable yarn.resourcemanager.recovery.enabled=true and pass a local path to yarn.resourcemanager.fs.state-store.uri, such as "file:///tmp/MYTMP".
> If the directory /tmp/MYTMP is not readable or writable, the RM should crash and print a "Permission denied" error.
> Currently, the RM throws a "java.io.FileNotFoundException: File file:/tmp/MYTMP/FSRMStateRoot/RMDTSecretManagerRoot does not exist" error. The RM reports exiting with status 1, but the RM process does not shut down.
> Snapshot of the ResourceManager log:
> 2013-09-27 18:31:36,621 INFO  security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:rollMasterKey(97)) - Rolling master-key for nm-tokens
> 2013-09-27 18:31:36,694 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(640)) - Failed to load/recover state
> java.io.FileNotFoundException: File file:/tmp/MYTMP/FSRMStateRoot/RMDTSecretManagerRoot does not exist
>         at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:379)
>         at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1478)
>         at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1518)
>         at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:564)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMDTSecretManagerState(FileSystemRMStateStore.java:188)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:112)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:635)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:855)
> 2013-09-27 18:31:36,697 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
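
The behavior the reporter expects, a clear "Permission denied" error when the state-store directory is unusable, can be illustrated with a small pre-flight check. This is only a sketch using the standard java.nio.file API. The class and method names are hypothetical, and nothing here is taken from the attached patch or from FileSystemRMStateStore.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StateStoreDirCheck {
    /**
     * Hypothetical helper: classifies a local state-store directory.
     * Returns "ok", or a short message naming the problem.
     */
    static String checkStateStoreDir(Path dir) {
        if (!Files.isDirectory(dir)) {
            return "missing: " + dir;
        }
        if (!Files.isReadable(dir) || !Files.isWritable(dir)) {
            return "Permission denied: " + dir;
        }
        return "ok";
    }

    public static void main(String[] args) {
        // /tmp/MYTMP is the example path from the report; pass another path as args[0].
        Path dir = Paths.get(args.length > 0 ? args[0] : "/tmp/MYTMP");
        System.out.println(checkStateStoreDir(dir));
    }
}
```

Running a check like this before loading state would distinguish a missing directory from an unreadable or unwritable one, which the FileNotFoundException in the log above does not.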



--
This message was sent by Atlassian JIRA
(v6.1#6144)
