curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: Creating PersistentEphemeralNode when noauth leads to cycle
Date Wed, 05 Aug 2015 22:26:50 GMT
hey Austin,
I think that what you're suggesting is reasonable, but the logistics of
implementing it are (I believe) somewhat complicated. Other devs, would be
good if you could confirm, but here's my take (largely just thinking out
loud).

The retry loop is tied to a Curator client, not to a particular recipe, and
the retry loop only retries on 'retryable' errors. Retryable errors are
those which are considered transient (connection loss etc.) and, NOAUTH is
not one of these errors. This is why you see Curator spinning in a tight
loop (CURATOR-228), because it submits the attempt to create a child, gets
a NOAUTH error, doesn't retry (because it's not retryable), then the
PersistentEphemeralNode code just attempts to retry creation immediately
and the process repeats.

Having the retry happen for 'non retryable' errors (like NOAUTH) would I
believe require spawning a new thread, as it can't be handled by the normal
retry process. I was trying to avoid this via just watching the parent node
for updates, but as mentioned in the previous email, watches don't get
triggered on ACL changes, so this doesn't help.

This is the only way that I can think of solving the problem. More than
happy to hear other suggestions!
cheers
Cam


On Thu, Aug 6, 2015 at 1:17 AM, Miller, Austin <
Austin.Miller@morganstanley.com> wrote:

> Hi Cameron,
>
>
>
> I’m very much for adding better logging to the pattern.  And, better
> logging, alone, represents an improvement.
>
>
>
> If by handle you mean, exits the retry loop?  This would be a non-starter
> for us because it would imply that one has to bounce the process or
> introduce extraordinary code at a higher level in order to recover
> dynamically, an essential feature in our async world.  We would have to
> take one of two actions to handle such a code change in curator, monkey
> patching the class or writing our own version.
>
>
>
> Curator is fundamentally async and the retry during configuration seems to
> implicitly say, while we’re async, we will keep trying and we won’t hammer
> your servers.  It seems that exiting the loop or continuing to hammer
> servers both represent inconsistency with Curator’s value proposition.
>
>
>
> If watches do not fire on ACL changes, then I think that the change I
> would be most interested in seeing adopted is to delay the next attempt by
> the pluggable retry policy.
>
>
>
> Regards,
>
> Austin
>
>
>
> *From:* Cameron McKenzie [mailto:mckenzie.cam@gmail.com]
> *Sent:* Sunday, August 02, 2015 4:37 PM
> *To:* Miller, Austin (ICT)
> *Cc:* dev@curator.apache.org; Gupta, Abhishek (ICT)
>
> *Subject:* Re: Creating PersistentEphemeralNode when noauth leads to cycle
>
>
>
> I've spent a bit more time looking at this and I haven't come up with a
> good solution.
>
>
>
> I was originally thinking that we could just watch the parent node in the
> NOAUTH case and then retry creation if the parent node changes.
> Unfortunately, watches don't fire when the ACL data of the node changes (I
> don't know why this is, but it's how Zookeeper works).
>
>
>
> This means that we're stuck with some sort of periodic retry mechanism,
> and I'm fairly loathe to introduce some arbitrary retry, as I think that
> ultimately this NOAUTH case is 'user error'.
>
>
>
> So, unless anyone has any better suggestions, I was just going to modify
> the recipe so that it handles any non retryable errors with a warning log.
>
> cheers
>
>
>
>
>
>
>
> On Fri, Jul 24, 2015 at 7:58 AM, Cameron McKenzie <mckenzie.cam@gmail.com>
> wrote:
>
> hey Ausin,
>
> All reasonable suggestions, I will have a think about it and see what we
> can come up with. As far as parent nodes not existing, the
> PersistentEphemeralNode recipe uses 'creatingParentsIfNecessary', so that's
> not an issue.
>
>
>
> It would just be a matter of detecting a NOAUTH error, then placing a
> watch on the parent node and retrying whenever its state changed (either
> the auth details change or the node itself gets deleted). We would still
> want to log something I think, as I would say that in 99% of cases this is
> a user error.
>
>
>
> I agree that tying the retry back to the policy is probably a good idea
> too.
>
> cheers
>
>
>
> On Fri, Jul 24, 2015 at 2:54 AM, Miller, Austin <
> Austin.Miller@morganstanley.com> wrote:
>
> Hi Cameron,
>
>
>
> Looking over the comments, the proposed change, to exit the loop by
> return, seems like a net negative.  While it prevents hammering the server,
> it seems to do so by violating the PEN contract, which I understood to be
> “If creating the node can eventually succeed, it will succeed.”  Now with
> the caveat that retry for auth issues is up to the user.
>
>
>
> The notion of using a watch on the parent node to wait for noauth to not
> be an issue seems to be much superior, because it would allow the contract
> to remain, eventually the PEN will succeed if it can succeed, while also
> not hammering the server.   I recognize the complexity would increase as
> the need for doing things in the right order to prevent race conditions
> would have to be properly thought through.
>
>
>
> What happens if the parent node doesn’t exist?
>
>
>
> Ultimately, could we tie failure back into the config retry somehow?  So
> that you can get exponential backoff retry behavior in PEN?
>
>
>
> Austin
>
>
>
> *From:* Cameron McKenzie [mailto:mckenzie.cam@gmail.com]
> *Sent:* Wednesday, July 22, 2015 7:40 PM
> *To:* dev@curator.apache.org
> *Cc:* Gupta, Abhishek (ICT); Miller, Austin (ICT)
> *Subject:* Re: Creating PersistentEphemeralNode when noauth leads to cycle
>
>
>
> This is definitely an issue, seems like it's the same problem as
> CURATOR-228. I can reproduce it, will look at a fix when I get a minute.
>
>
>
> On Wed, Jul 22, 2015 at 7:48 PM, Szekeres, Zoltan <
> Zoltan.Szekeres@morganstanley.com> wrote:
>
> Hi Curator Dev!
>
> We've encountered an issue when creating a PersistentEphemeralNode and the
> authentication fails for the given path. We received a NOAUTH result code,
> which the BackgroundCallback's processResult doesn't handle and tries to
> create the node again. This leads to a cycle. This is an issue for us,
> because it can generate thousands of requests per minute.
>
> We use curator 2.4.2 and zookeeper 3.4.6.
>
> A possibly related JIRA:
>
> https://issues.apache.org/jira/i#browse/CURATOR-228?jql=text%20~%20%22PersistentEphemeralNode%22
>
> Do you see this as an issue?
>
> Best Regards,
> Zoltan Szekeres
> Morgan Stanley | Institutional & Corporate Tech
> Lechner Odon fasor 8 | Floor 05
> Budapest, 1095
> Phone: +36 1 881-4657
> Zoltan.Szekeres@morganstanley.com<mailto:Zoltan.Szekeres@morganstanley.com
> >
>
>
>
> ________________________________
>
> NOTICE: Morgan Stanley is not acting as a municipal advisor and the
> opinions or views contained herein are not intended to be, and do not
> constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall
> Street Reform and Consumer Protection Act. If you have received this
> communication in error, please destroy all electronic and paper copies; do
> not disclose, use or act upon the information; and notify the sender
> immediately. Mistransmission is not intended to waive confidentiality or
> privilege. Morgan Stanley reserves the right, to the extent permitted under
> applicable law, to monitor electronic communications. This message is
> subject to terms available at the following link:
> http://www.morganstanley.com/disclaimers If you cannot access these
> links, please notify us by reply message and we will send the contents to
> you. By messaging with Morgan Stanley you consent to the foregoing.
>
>
>
>
> ------------------------------
>
>
> NOTICE: Morgan Stanley is not acting as a municipal advisor and the
> opinions or views contained herein are not intended to be, and do not
> constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall
> Street Reform and Consumer Protection Act. If you have received this
> communication in error, please destroy all electronic and paper copies; do
> not disclose, use or act upon the information; and notify the sender
> immediately. Mistransmission is not intended to waive confidentiality or
> privilege. Morgan Stanley reserves the right, to the extent permitted under
> applicable law, to monitor electronic communications. This message is
> subject to terms available at the following link:
> http://www.morganstanley.com/disclaimers If you cannot access these
> links, please notify us by reply message and we will send the contents to
> you. By messaging with Morgan Stanley you consent to the foregoing.
>
>
>
>
>
>
> ------------------------------
>
> NOTICE: Morgan Stanley is not acting as a municipal advisor and the
> opinions or views contained herein are not intended to be, and do not
> constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall
> Street Reform and Consumer Protection Act. If you have received this
> communication in error, please destroy all electronic and paper copies; do
> not disclose, use or act upon the information; and notify the sender
> immediately. Mistransmission is not intended to waive confidentiality or
> privilege. Morgan Stanley reserves the right, to the extent permitted under
> applicable law, to monitor electronic communications. This message is
> subject to terms available at the following link:
> http://www.morganstanley.com/disclaimers If you cannot access these
> links, please notify us by reply message and we will send the contents to
> you. By messaging with Morgan Stanley you consent to the foregoing.
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message