curator-dev mailing list archives

From Cameron McKenzie <mckenzie....@gmail.com>
Subject CURATOR-79
Date Fri, 08 Aug 2014 06:02:20 GMT
Guys,
I've been looking into a fix for CURATOR-79 (
https://issues.apache.org/jira/browse/CURATOR-79) and have found it to be
slightly more complicated than initially expected.

The locking recipes are using protected zNodes (i.e. the zNode name contains
a random UUID that is tied to a particular builder instance) for locks,
which is sensible, but there seems to be an issue with this.

The protected logic basically looks for the cause of failure on a create,
and if it's connection loss, then it does an ensured delete on the path it
was trying to create to ensure that it's removed if it did get created.

For CURATOR-79, an InterruptedException is causing this call to fail when
waiting for the response from ZK. This means that the protected logic does
not fire and we end up with an orphaned node.

It's possible with some ugliness to handle this in the InterProcessMutex,
but I think that maybe it's better fixed in the protected logic. Maybe the
protected logic could be modified so that it will occur on ConnectionLoss
or on any non-KeeperException (e.g. InterruptedException). This would cause
the zNode to be removed if it was created, and would fix this deadlock
issue.
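
To make the proposal concrete, here is a rough sketch of the two decision
rules side by side. It is not Curator code: the class name, the method names
and the nested exception types are all hypothetical stand-ins that I made up
so the example compiles without ZooKeeper on the classpath. It only shows
which failure causes would trigger the cleanup delete under each rule.

```java
public class ProtectedCleanupSketch {
    // Hypothetical stand-ins for the ZooKeeper exception types, so the sketch
    // compiles without any dependencies. These are NOT the real classes.
    public static class KeeperException extends Exception {}
    public static class ConnectionLossException extends KeeperException {}
    public static class NodeExistsException extends KeeperException {}

    // Behaviour as described above: clean up only on connection loss.
    public static boolean currentRule(Throwable cause) {
        return cause instanceof ConnectionLossException;
    }

    // Proposed behaviour: clean up on connection loss OR on any failure
    // that is not a KeeperException (for example InterruptedException).
    public static boolean proposedRule(Throwable cause) {
        return cause instanceof ConnectionLossException
                || !(cause instanceof KeeperException);
    }

    public static void main(String[] args) {
        Throwable[] causes = {
            new ConnectionLossException(),
            new NodeExistsException(),
            new InterruptedException()
        };
        // Print, for each failure cause, whether each rule would clean up.
        for (Throwable cause : causes) {
            System.out.println(cause.getClass().getSimpleName()
                    + ": current=" + currentRule(cause)
                    + ", proposed=" + proposedRule(cause));
        }
    }
}
```

The only case where the two rules differ is the InterruptedException one,
which is the case behind the orphaned node. Other KeeperExceptions are left
alone under both rules.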

I would welcome anyone's opinion on the way forward.
cheers
Cam
