curator-user mailing list archives

From Erik Nelson
Subject Re: protection on ephemeral nodes can go haywire
Date Wed, 15 Jan 2014 05:39:11 GMT
That is similar, but it is caused by sequential mode (appending a number) rather than protected
mode (prepending a GUID).
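
To make the distinction concrete, here is a minimal sketch of the two name shapes. It does not call Curator or ZooKeeper; the class and method names are hypothetical, and the "_c_" + GUID + "-" prefix layout and the zero-padded 10-digit sequence suffix are stated as assumptions about how the names are built.

```java
import java.util.UUID;

public class NodeNameShapes {
    // Assumption: protected mode prefixes the leaf name with "_c_" + GUID + "-".
    static final String PROTECTED_PREFIX = "_c_";

    // Sequential mode: a zero-padded 10-digit counter is appended to the name.
    public static String sequentialName(String name, int counter) {
        return String.format("%s%010d", name, counter);
    }

    // Protected mode: a GUID is prepended so the creator can recognize its own node.
    public static String protectedName(String name, UUID guid) {
        return PROTECTED_PREFIX + guid + "-" + name;
    }

    public static void main(String[] args) {
        UUID guid = UUID.randomUUID();
        System.out.println(sequentialName("member-", 7));
        System.out.println(protectedName("member-", guid));
        // The failure discussed in this thread would look like a name that was protected twice.
        System.out.println(protectedName(protectedName("member-", guid), guid));
    }
}
```

The last line only imitates the symptom (a prefix applied to an already prefixed name); it says nothing about what actually triggers it.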

We have a lot of issues with connection flapping to zookeeper quorums, and I have some suspicion
this is happening when connections get reset.

I just cleared out a large number of servers that were in the "node name too long and client
gets an error on trying to create new node" state, so that the ZooKeeper server-side log actually
has a vaguely legible signal:noise ratio and I can figure out how the ever-expanding protected
node names start. I certainly haven't ever been able to reproduce this in development!

On Jan 14, 2014, at 9:17 PM, Adarsh Bhat wrote:

CURATOR-56 sounds similar, but in a different recipe.

On Tuesday, January 14, 2014, Erik Nelson wrote:
Okay. The problem is, of course, that this happens in [heavily loaded] production. I'll see
if I can replicate in dev to get that test case.

On Jan 14, 2014, at 4:52 PM, Jordan Zimmerman wrote:

OK - please add an issue on Curator’s Jira and a test case.


From: Erik Nelson
Reply: Erik Nelson
Date: January 14, 2014 at 4:52:21 PM
To: Jordan Zimmerman
Subject:  Re: protection on ephemeral nodes can go haywire
This is Curator 2.3 and ZooKeeper 3.4.5-cdh4.3.2.

On Jan 14, 2014, at 4:44 PM, Jordan Zimmerman wrote:

That looks like a bug to me. The GUID should only be present once. I remember there being
a bug like this a long time ago. What version are you using?
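
One way to spot affected nodes is to count how many protection prefixes appear in each child name; anything above one would indicate the problem. The sketch below is self-contained and only does string matching. The class name is hypothetical, and the pattern assumes the prefix is "_c_" followed by a standard 36-character GUID and a "-".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProtectedPrefixCheck {
    // Assumption: a protection prefix is "_c_" + GUID (8-4-4-4-12 hex digits) + "-".
    private static final Pattern PREFIX = Pattern.compile(
        "_c_[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}-");

    // Counts every occurrence of the prefix anywhere in the child name.
    public static int countProtectionPrefixes(String childName) {
        Matcher m = PREFIX.matcher(childName);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String guid = "123e4567-e89b-12d3-a456-426614174000";
        String once = "_c_" + guid + "-member";
        String twice = "_c_" + guid + "-" + once;
        System.out.println(countProtectionPrefixes(once));
        System.out.println(countProtectionPrefixes(twice));
    }
}
```

The same check could be run over the names returned by a children listing on the parent path to find nodes whose names have started to grow.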


From: Erik Nelson
Date: January 14, 2014 at 4:43:02 PM
Subject:  protection on ephemeral nodes can go haywire
I've noticed that it is quite common for my list of nodes to contain a bunch of entries that
look like:

node name]

sometimes even with the protection guid repeated again with the same actual node name.

This can get to the point where I start getting exceptions on the client where it complains about
the node name needing to begin with '/'; I believe that the name exceeds the maximum size
and is being truncated.

Is this a known issue? My use of PersistentEphemeralNode isn't too fancy, and this happens
without any badness in the
