manifoldcf-dev mailing list archives

From "Aeham Abushwashi (JIRA)" <>
Subject [jira] [Created] (CONNECTORS-1123) ZK node leak
Date Wed, 17 Dec 2014 16:51:13 GMT
Aeham Abushwashi created CONNECTORS-1123:

             Summary: ZK node leak
                 Key: CONNECTORS-1123
             Project: ManifoldCF
          Issue Type: Bug
          Components: Framework core
    Affects Versions: ManifoldCF 1.8
         Environment: 4-node manifold cluster, 3-node ZK ensemble for coordination and configuration
            Reporter: Aeham Abushwashi

Looking at the stats of the ZooKeeper cluster, I was struck by the very high node count reported
by the ZK stat command, which shows just over 3.84 MILLION nodes. The number keeps
rising as long as the manifold nodes are running. Stopping manifold does NOT reduce the number
significantly, nor does restarting the ZK ensemble.

The ZK ensemble was initialised around 20 days ago. Manifold has been running on and off on
this cluster since that time.

The flat nature of the manifold node structure in ZK (at least in the dev_1x branch) makes
it difficult to identify node names but after tweaking the jute.maxbuffer parameter on the
client, I was able to get a list of all nodes. There's a huge number of nodes with the name
pattern org.apache.manifoldcf.locks-<Output Connection>:<Hash>. 
I could see this node name pattern being used in IncrementalIngester#documentDeleteMultiple
and IncrementalIngester#documentRemoveMultiple. However, I'm not expecting any deletions in
the tests I've been running recently. Perhaps this is part of the duplicate deletion logic
which came up in an email thread earlier today? Or maybe there's another code path I missed
entirely which creates nodes with names like the above.
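
For anyone wanting to reproduce the breakdown, here is a minimal Java sketch of how a flat listing of node names could be tallied by name pattern. It is only an illustration: the class ZkNodeTally, its methods, and the sample connection names are hypothetical, and it assumes the names have already been dumped to a list (for example from the client's listing of the root node) and that the hash part after the last ':' contains no further colons.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of ManifoldCF: groups ZK node names by the
// part before the last ':' so that lock nodes for the same output
// connection are counted together.
public class ZkNodeTally {

    // Returns the grouping key for a node name. Assumption: names look like
    // org.apache.manifoldcf.locks-<Output Connection>:<Hash>, and the hash
    // itself contains no ':'. Names without ':' are their own group.
    static String groupKey(String nodeName) {
        int idx = nodeName.lastIndexOf(':');
        return idx < 0 ? nodeName : nodeName.substring(0, idx);
    }

    // Counts how many node names fall into each group, sorted by key.
    static Map<String, Integer> tally(List<String> nodeNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : nodeNames) {
            counts.merge(groupKey(name), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample names; a real run would read the dumped node list.
        List<String> sample = List.of(
            "org.apache.manifoldcf.locks-OutputA:0a1b2c",
            "org.apache.manifoldcf.locks-OutputA:3d4e5f",
            "org.apache.manifoldcf.locks-OutputB:6a7b8c",
            "some-other-node");
        tally(sample).forEach((key, count) ->
            System.out.println(count + "\t" + key));
    }
}
```

Sorting the result by count rather than by key would make the dominant pattern stand out sooner on a listing of several million names.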

