hadoop-common-dev mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-15116) NPE in ResourceManager when ZooKeeper goes down temporary (HA Mode)
Date Wed, 13 Dec 2017 14:33:00 GMT
Sunil G created HADOOP-15116:
--------------------------------

             Summary: NPE in ResourceManager when ZooKeeper goes down temporary (HA Mode)
                 Key: HADOOP-15116
                 URL: https://issues.apache.org/jira/browse/HADOOP-15116
             Project: Hadoop Common
          Issue Type: Bug
          Components: ha
    Affects Versions: 3.0.0-beta1
            Reporter: Sunil G


In an HA-enabled cluster (3.0), the RM fails to start with an NPE thrown from ActiveStandbyElector.
ZooKeeper was down at the time, so the ZooKeeper client kept retrying the connection for a while.

{code}
2017-12-13 18:21:22,460 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-12-13 18:21:22,544 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService failed in state INITED; cause: java.lang.NullPointerException
java.lang.NullPointerException
        at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1039)
        at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1036)
        at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1101)
        at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1093)
        at org.apache.hadoop.ha.ActiveStandbyElector.createWithRetries(ActiveStandbyElector.java:1036)
        at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:347)
        at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.serviceInit(ActiveStandbyElectorBasedElectorService.java:110)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:326)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1420)
2017-12-13 18:21:22,545 INFO org.apache.hadoop.ha.ActiveStandbyElector: Yielding from election
2017-12-13 18:21:22,545 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: java.lang.NullPointerException
{code}
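
The trace is consistent with a client handle that is still null being dereferenced inside the retry wrapper. This is an assumption, not something confirmed against the source at ActiveStandbyElector.java:1039. A minimal, self-contained sketch of that failure mode follows. Every name in it (NullClientSketch, FakeZkClient, TransientZkException, ZkAction, doWithRetries) is hypothetical and only stands in for the real Hadoop and ZooKeeper classes. It shows that a retry loop which catches only one checked exception type lets an NPE escape on the first attempt, while an explicit null check turns the missing session into a retryable error.

```java
// Sketch only: all names here are hypothetical stand-ins, not Hadoop or ZooKeeper APIs.
final class NullClientSketch {

    /** Checked exception that the retry loop treats as retryable. */
    static final class TransientZkException extends Exception {
        TransientZkException(String message) {
            super(message);
        }
    }

    /** Minimal stand-in for a ZooKeeper handle. */
    static final class FakeZkClient {
        String create(String path) {
            return path;
        }
    }

    /** One operation that the retry loop can re-run. */
    interface ZkAction<T> {
        T run() throws TransientZkException;
    }

    // Remains null if no session was established, e.g. while ZooKeeper is down.
    private final FakeZkClient zkClient;
    private final int maxRetries;
    int attempts = 0; // number of times an action has been run

    NullClientSketch(FakeZkClient zkClient, int maxRetries) {
        this.zkClient = zkClient;
        this.maxRetries = maxRetries;
    }

    /** Re-runs the action on TransientZkException only; runtime exceptions escape at once. */
    private <T> T doWithRetries(ZkAction<T> action) throws TransientZkException {
        int retry = 0;
        while (true) {
            attempts++;
            try {
                return action.run();
            } catch (TransientZkException e) {
                if (retry++ >= maxRetries) {
                    throw e;
                }
            }
        }
    }

    /** Unguarded: dereferences zkClient directly, so a null handle raises an NPE. */
    String createUnguarded(final String path) throws TransientZkException {
        return doWithRetries(() -> zkClient.create(path));
    }

    /** Guarded: reports a missing session as a retryable checked exception. */
    String createGuarded(final String path) throws TransientZkException {
        return doWithRetries(() -> {
            if (zkClient == null) {
                throw new TransientZkException("no ZooKeeper session for " + path);
            }
            return zkClient.create(path);
        });
    }

    public static void main(String[] args) {
        NullClientSketch unguarded = new NullClientSketch(null, 2);
        try {
            unguarded.createUnguarded("/example-parent");
        } catch (NullPointerException e) {
            System.out.println("unguarded: NPE, attempts=" + unguarded.attempts);
        } catch (TransientZkException e) {
            System.out.println("unguarded: " + e.getMessage());
        }

        NullClientSketch guarded = new NullClientSketch(null, 2);
        try {
            guarded.createGuarded("/example-parent");
        } catch (TransientZkException e) {
            System.out.println("guarded: " + e.getMessage() + ", attempts=" + guarded.attempts);
        }
    }
}
```

In this sketch the guarded variant still fails while the client is null, but it fails with a descriptive, retryable error after the configured number of retries. Whether the real fix should retry, reconnect, or fail fast is left open here.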



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


