curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: CURATOR-79
Date Fri, 08 Aug 2014 21:58:29 GMT
The proposed alternative is pretty much the same thing, but moving the
logic to the LockInternals class. The added complexity of this is that the
LockInternals class doesn't know what the name of the zNode that was going
to be created is called. We'd need to expose the generated UUID on the
CreateBuilder.

My preference is to fix this in the protected() code. We can either just
handle the InterruptedException, or handle any non KeeperException (other
than ConnectionLoss). The InterruptedException is what's happening in this
case, but if we get an NPE or something in that code, the same thing would
occur, so I think that perhaps we should just have a blanket catch all.

I have coded this up already and all of the unit tests pass ok.
cheers


On Fri, Aug 8, 2014 at 11:05 PM, Mike Drob <madrob@cloudera.com> wrote:

> Explicitly coding for a possible InterruptedException sounds good to me,
> but we already know that I prefer (more) diverse Exception types.
>
> I'm not sure I understand what the proposed alternative is?
>
> Mike
>
>
> On Fri, Aug 8, 2014 at 1:02 AM, Cameron McKenzie <mckenzie.cam@gmail.com>
> wrote:
>
> > Guys,
> > I've been looking into a fix for CURATOR-79 (
> > https://issues.apache.org/jira/browse/CURATOR-79) and have found it to
> be
> > slightly more complicated than initially expected.
> >
> > The locking recipes are using protected zNodes (i.e the zNode name
> contains
> > a random UUID that is tied to a particular builder instance) for locks,
> > which is sensible, but there seems to be an issue with this.
> >
> > The protected logic basically looks for the cause of failure on a create,
> > and if it's connection loss, then it does an ensured deleted on the path
> it
> > was trying to create to ensure that it's removed if it did get created.
> >
> > For CURATOR-79, and InterruptedException is causing this call to fail
> when
> > waiting for the response from ZK. This means that the protected logic
> does
> > not fire and we end up with an orphaned node.
> >
> > It's possible with some ugliness to handle this in the InterprocesMutex,
> > but I think that maybe it's better fixed in the protected logic. Maybe
> the
> > protected logic could be modified so that it will occur on ConnectionLoss
> > or on any non-KeeperException (i.e. InterruptedException). This would
> cause
> > the zNode to be removed if it was created, and would fix this deadlock
> > issue.
> >
> > I would welcome anyone's opinion on the way forward.
> > cheers
> > Cam
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message