curator-dev mailing list archives

From "Jordan Zimmerman (JIRA)" <>
Subject [jira] [Commented] (CURATOR-311) SharedValue could hold stale data when quorum membership changes
Date Tue, 29 Mar 2016 01:25:25 GMT


Jordan Zimmerman commented on CURATOR-311:

I was reviewing the SharedValue code and am shocked to see that it reads the value in the Watcher
callback. This is a big no-no. I don't know if it's the source of the problem, but it needs
to be fixed.

> SharedValue could hold stale data when quorum membership changes
> ----------------------------------------------------------------
>                 Key: CURATOR-311
>                 URL:
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Recipes
>    Affects Versions: 3.1.0
>         Environment: Linux
>            Reporter: Jian Fang
> We run a ZooKeeper 3.5.1-alpha quorum on EC2 instances, and the quorum membership can
> change; for example, a peer may be replaced by a new EC2 instance after an EC2 instance
> termination. We use Apache Curator 3.1.0 as the ZooKeeper client. During our testing, we found
> that the SharedValue data structure could hold stale data during and after the replacement
> of a peer, which led to system failure.
> We looked into the SharedValue code. It seems to always return the value from an in-memory
> reference variable, and that value is only updated by a watcher. If the watch is lost for any
> reason, the value never gets a chance to be updated again.
> For now, to work around this issue, we have added a connection state listener that forces
> SharedValue to call readValue(), i.e., to read the data from ZooKeeper directly, whenever the
> connection state changes to RECONNECTED.
> It would be great if this issue could be fixed in Curator directly.
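
The failure mode and the workaround described in the report can be sketched in plain Java. This is only a toy model under stated assumptions, not Curator's code: FakeStore, CachedValue, ConnState, dropWatch() and onStateChanged() are hypothetical names invented for illustration, and the fake store just imitates a one-shot watch. It shows a cached value going stale once its watch is lost, and a reconnect handler forcing a fresh read.

```java
import java.util.concurrent.atomic.AtomicReference;

public class StaleCacheSketch {
    // Hypothetical connection states, loosely mirroring the RECONNECTED state named in the report.
    enum ConnState { CONNECTED, SUSPENDED, LOST, RECONNECTED }

    // Hypothetical stand-in for a ZooKeeper node: holds the authoritative value
    // and at most one one-shot watch.
    static class FakeStore {
        private String data = "v1";
        private Runnable watch;

        // Returns the current data and registers a one-shot watch.
        String read(Runnable watcher) {
            this.watch = watcher;
            return data;
        }

        // Updates the data and fires the watch once, if one is registered.
        void write(String value) {
            data = value;
            Runnable w = watch;
            watch = null;
            if (w != null) {
                w.run();
            }
        }

        // Simulates the watch being lost, e.g. while a quorum peer is replaced.
        void dropWatch() {
            watch = null;
        }
    }

    // Hypothetical cache that, like the behavior described in the report,
    // serves reads from memory and refreshes only when its watch fires.
    static class CachedValue {
        private final FakeStore store;
        private final AtomicReference<String> cached = new AtomicReference<>();

        CachedValue(FakeStore store) {
            this.store = store;
            readValue();
        }

        // Reads from the store and re-registers the watch. For brevity this toy
        // model re-reads inside the watch callback, which the reply in this
        // thread flags as something to fix in the real code.
        void readValue() {
            cached.set(store.read(this::readValue));
        }

        String get() {
            return cached.get();
        }

        // The workaround from the report: force a direct read on RECONNECTED.
        void onStateChanged(ConnState newState) {
            if (newState == ConnState.RECONNECTED) {
                readValue();
            }
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        CachedValue value = new CachedValue(store);

        store.write("v2");               // watch fires, cache refreshes
        System.out.println(value.get()); // v2

        store.dropWatch();               // watch is lost
        store.write("v3");               // nothing notifies the cache
        System.out.println(value.get()); // v2 (stale)

        value.onStateChanged(ConnState.RECONNECTED);
        System.out.println(value.get()); // v3
    }
}
```

In this model the forced read also re-registers the watch, so later writes are seen again. Whether a lost watch always coincides with a RECONNECTED event in a real deployment is not established by the report.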

This message was sent by Atlassian JIRA
