zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stefano Salmaso (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ZOOKEEPER-2464) NullPointerException on ContainerManager
Date Mon, 04 Jul 2016 22:21:11 GMT
Stefano Salmaso created ZOOKEEPER-2464:

             Summary: NullPointerException on ContainerManager
                 Key: ZOOKEEPER-2464
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.5.1
            Reporter: Stefano Salmaso

I would like to expose you to a problem that we are experiencing.
We are using a cluster of 7 zookeeper and we use them to implement a distributed lock using
Curator (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
So .. we tried to play with the servers to see if everything worked properly and we stopped
and start servers to see that the system worked well
(like stop 03, stop 05, stop 06, start 05, start 06, start 03)

We saw a strange behavior.
The number of znodes grew up without stopping (normally we had 4000 or 5000, we got to 60,000
and then we stopped our application)

In zookeeeper logs I saw this (on leader only, one every minute)

2016-07-04 14:53:50,302 [myid:7] - ERROR [ContainerManagerTask:ContainerManager$1@84] - Error
checking containers
       at org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
       at org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
       at org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
       at java.util.TimerThread.mainLoop(Timer.java:555)
       at java.util.TimerThread.run(Timer.java:505)

We have not yet deleted the data ... so the problem can be reproduced on our servers

This message was sent by Atlassian JIRA

View raw message