hadoop-yarn-issues mailing list archives

From "Arpit Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-1220) Yarn RM fs state store should handle safemode exceptions
Date Thu, 19 Sep 2013 00:24:52 GMT
Arpit Gupta created YARN-1220:

             Summary: Yarn RM fs state store should handle safemode exceptions
                 Key: YARN-1220
                 URL: https://issues.apache.org/jira/browse/YARN-1220
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 2.1.0-beta
            Reporter: Arpit Gupta
            Assignee: Vinod Kumar Vavilapalli

ons: 0
2013-09-18 05:41:13,542 ERROR recovery.RMStateStore (RMStateStore.java:handleStoreEvent(490))
- Error removing app: application_1379482521108_0003
Cannot delete /tmp/hadoop-yarn/yarn/system/rmstore/FSRMStateRoot/RMAppRoot/application_1379482521108_0003.
Name node is in safe mode.
The reported blocks 1018 has reached the threshold 1.0000 of total blocks 1018. The number
of live datanodes 5 has reached the minimum number 0. Safe mode will be turned off automatically
in 20 seconds.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:3124)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:3083)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3067)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:697)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:491)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$Clien

The issue here is that if the NameNode is in safe mode while the RM is interacting with the
FS state store, the RM won't be able to update the stored state. In this particular case the
app was never removed from the store, and upon RM restart the app was recovered when it did
not need to be.
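
One way to handle this would be to retry the store operation while the NameNode reports safe mode, rather than dropping the update. The following is a minimal, self-contained sketch of that idea and not the actual fix for this issue. The class `SafeModeRetry`, the `StoreOp` interface, the `runWithRetry` helper and the local `SafeModeException` stand-in are all hypothetical names introduced for illustration; a real implementation would need to detect the HDFS safe-mode error raised by the NameNode and pick retry limits suited to the cluster.

```java
import java.io.IOException;

final class SafeModeRetry {

    /** Hypothetical stand-in for the safe-mode error reported by the NameNode. */
    static class SafeModeException extends IOException {
        SafeModeException(String message) {
            super(message);
        }
    }

    /** A state-store operation (store, update, remove) that may fail with an IOException. */
    interface StoreOp {
        void run() throws IOException;
    }

    /**
     * Runs op, retrying only when it fails because of safe mode.
     * Returns the number of attempts that were needed.
     * Any other IOException is propagated immediately.
     */
    static int runWithRetry(StoreOp op, int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                op.run();
                return attempt;
            } catch (SafeModeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and let the caller decide what to do
                }
                Thread.sleep(sleepMillis); // wait for safe mode to be turned off
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated removal that fails twice with safe mode, then succeeds.
        int[] calls = {0};
        int attempts = runWithRetry(() -> {
            if (calls[0]++ < 2) {
                throw new SafeModeException("Name node is in safe mode.");
            }
        }, 5, 10L);
        System.out.println("removed after " + attempts + " attempt(s)");
    }
}
```

Retrying only on the safe-mode error keeps unrelated failures visible to the caller, and the attempt limit prevents the RM from blocking indefinitely if safe mode is never left.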

