curator-dev mailing list archives

From "Jian Fang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CURATOR-311) SharedValue could hold stale data when quorum membership changes
Date Tue, 22 Mar 2016 18:46:25 GMT
Jian Fang created CURATOR-311:
---------------------------------

             Summary: SharedValue could hold stale data when quorum membership changes
                 Key: CURATOR-311
                 URL: https://issues.apache.org/jira/browse/CURATOR-311
             Project: Apache Curator
          Issue Type: Bug
          Components: Recipes
    Affects Versions: 3.1.0
         Environment: Linux
            Reporter: Jian Fang


We run a ZooKeeper 3.5.1-alpha quorum on EC2 instances, and the quorum membership can change:
for example, a peer may be replaced by a new EC2 instance after an EC2 instance termination.
We use Apache Curator 3.1.0 as the ZooKeeper client. During our testing, we found that the
SharedValue data structure can hold stale data during and after the replacement of a peer,
which led to a system failure.

I looked at the SharedValue code, and it seems to always return the value from an in-memory
reference variable that is updated only by a watcher. If the watch is lost for any reason,
the value never gets a chance to be updated again.
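
The failure mode described above can be illustrated with a small model that uses only the JDK.
This is a sketch under stated assumptions, not Curator's code: FakeNode and WatchOnlyCache are
hypothetical stand-ins for a znode and for SharedValue, and dropWatch() simulates the watch
being lost for whatever reason.

```java
// Hypothetical stand-in for a znode; not a ZooKeeper or Curator class.
class FakeNode {
    private String data;
    private Runnable watcher; // at most one pending watch, fired once

    FakeNode(String initial) {
        data = initial;
    }

    // Return the current data and register a one-shot watch.
    String readAndWatch(Runnable w) {
        watcher = w;
        return data;
    }

    // Change the data and fire the pending watch, if there is one.
    void write(String newData) {
        data = newData;
        Runnable w = watcher;
        watcher = null; // the watch is consumed when it fires
        if (w != null) {
            w.run();
        }
    }

    // Assumption for illustration: the watch disappears without firing.
    void dropWatch() {
        watcher = null;
    }
}

// Hypothetical stand-in for SharedValue: get() only returns the in-memory copy.
class WatchOnlyCache {
    private final FakeNode node;
    private volatile String cached;

    WatchOnlyCache(FakeNode node) {
        this.node = node;
        refresh();
    }

    // Re-read the node and re-register the watch; called only from the watch.
    private void refresh() {
        cached = node.readAndWatch(this::refresh);
    }

    String get() {
        return cached;
    }
}
```

In this model, a write that happens while the watch is in place updates the cached copy, but
once dropWatch() has been called, later writes are never seen by get().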
 
As a workaround, I added a connection state listener that forces SharedValue to call
readValue(), i.e., to read the data from ZooKeeper directly, whenever the connection state
changes to RECONNECTED.
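
The shape of this workaround can be sketched with a JDK-only model. Everything here is
hypothetical and for illustration only: LinkState, FakeConnection and RefreshOnReconnectValue
are not Curator types. In Curator itself, the corresponding pieces would be a
ConnectionStateListener registered via getConnectionStateListenable() and a check for
ConnectionState.RECONNECTED; the actual patch is not shown in this report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical connection states; the names mirror those used in the report.
enum LinkState { CONNECTED, SUSPENDED, RECONNECTED }

// Hypothetical stand-in for the client connection; no watches are modeled,
// which corresponds to the case where the watch has already been lost.
class FakeConnection {
    private final List<Consumer<LinkState>> listeners = new ArrayList<>();
    private String serverData;

    FakeConnection(String initial) {
        serverData = initial;
    }

    // Direct read from the "server".
    String read() {
        return serverData;
    }

    // A change on the server that no watch reports to the client.
    void serverWrite(String value) {
        serverData = value;
    }

    void addStateListener(Consumer<LinkState> listener) {
        listeners.add(listener);
    }

    // Deliver a connection state change to every registered listener.
    void fire(LinkState state) {
        for (Consumer<LinkState> listener : listeners) {
            listener.accept(state);
        }
    }
}

// Cached value that is re-read from the server whenever the link reconnects.
class RefreshOnReconnectValue {
    private volatile String cached;

    RefreshOnReconnectValue(FakeConnection conn) {
        cached = conn.read();
        conn.addStateListener(state -> {
            if (state == LinkState.RECONNECTED) {
                cached = conn.read(); // forced direct read, bypassing the watch
            }
        });
    }

    String get() {
        return cached;
    }
}
```

The forced read only bounds how long the value can stay stale: in this model the cached copy
is still out of date between the missed update and the next RECONNECTED event.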

It would be great if this issue could be fixed in Curator directly.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
