curator-user mailing list archives

From Satish Duggana <satish.dugg...@gmail.com>
Subject Receiving KeeperException with NoNode when LeaderLatch#getLeader()
Date Fri, 18 Nov 2016 04:43:07 GMT
Hi,
In some scenarios,
*org.apache.curator.framework.recipes.leader.LeaderLatch#getLeader()*
throws *KeeperException* with *Code#NONODE*, as shown in the stack trace
below. A participant's ephemeral ZK node may have been removed because its
connection/session was closed.

You can see the below code at
https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/leader/LeaderLatch.java#L451

public Participant getLeader() throws Exception
{
    Collection<String> participantNodes = LockInternals.getParticipantNodes(client, latchPath, LOCK_NAME, sorter);
    return LeaderSelector.getLeader(client, participantNodes);
}


I suspect it hits a race condition: the list of participant nodes is
retrieved, but by the time LeaderSelector#getLeader() reads a node's data,
that node has already been removed because of a session timeout, so the call
throws KeeperException with the NoNode code. The call is not retried because
RetryLoop retries only on connection/session timeouts, but in this case
NoNode should have been retried. I could not find any API on CuratorClient
to configure which KeeperException codes are retried. It would be good if the
*org.apache.curator.framework.CuratorFrameworkFactory.Builder* APIs offered a
way to specify which kinds of errors should be retried.
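
Until something like that exists, a caller-side workaround could look like the
minimal sketch below. It is not part of Curator: the class RetryOnCode, the
method callWithRetry, and its parameters are hypothetical names used only for
illustration. It retries an operation when a caller-supplied predicate says
the exception is retryable, and rethrows everything else immediately.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper, not a Curator API: retries an operation only for
// exceptions that the caller-supplied predicate marks as retryable.
final class RetryOnCode {
    private RetryOnCode() {}

    static <T> T callWithRetry(Callable<T> op,
                               Predicate<Exception> retryable,
                               int maxAttempts,
                               long sleepMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // Give up on non-retryable errors or when attempts are exhausted.
                if (!retryable.test(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Fixed pause before the next attempt.
                Thread.sleep(sleepMs);
            }
        }
    }
}
```

Assuming a LeaderLatch named latch, it might be invoked as
RetryOnCode.callWithRetry(latch::getLeader, e -> e instanceof KeeperException.NoNodeException, 3, 100L),
so that each retry fetches a fresh participant list before reading the leader node.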

Intermittent Exception found with the stack trace:
2016-11-15 06:09:33.954 o.a.s.d.nimbus [ERROR] Error when processing event
org.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /storm/leader-lock/_c_97c09eed-5bba-4ac8-a05f-abdc4e8e95cf-latch-0000000002
     at org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
     at org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
     at org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
     at org.apache.storm.shade.org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:304)
     at org.apache.storm.shade.org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:293)
     at org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108)
     at org.apache.storm.shade.org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:290)
     at org.apache.storm.shade.org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:281)
     at org.apache.storm.shade.org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:42)
     at org.apache.storm.shade.org.apache.curator.framework.recipes.leader.LeaderSelector.participantForPath(LeaderSelector.java:375)
     at org.apache.storm.shade.org.apache.curator.framework.recipes.leader.LeaderSelector.getLeader(LeaderSelector.java:346)
     at org.apache.storm.shade.org.apache.curator.framework.recipes.leader.LeaderLatch.getLeader(LeaderLatch.java:454)


Thanks,
Satish.
