accumulo-dev mailing list archives

From Keith Turner <>
Subject Re: Interesting bug report
Date Tue, 26 Jan 2016 16:45:57 GMT
On Mon, Jan 25, 2016 at 12:14 PM, Josh Elser <> wrote:

> I've long been waffling about the usefulness of our "infinite retry" logic.
> It's great for daemons. It sucks for humans.
> Maybe there's a story in addressing this via ClientConfiguration -- let
> the user tell us the policy they want to follow.

+1 for configurable retry policy. Curator has a configurable retry
policy. Would be good to see how it works when designing something for
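
To make the idea concrete, here is a minimal sketch of what a pluggable retry policy could look like. Every name in it (RetrySketch, RetryPolicy, forever, bounded, call) is hypothetical and chosen for illustration; it is not Accumulo's client API and it does not reproduce Curator's interfaces. It assumes the policy is consulted after each failed attempt and either returns a delay or signals that the caller should give up.

```java
import java.util.concurrent.Callable;

// Hypothetical names throughout; this is not Accumulo's or Curator's API.
final class RetrySketch {

    /** Decides whether another attempt is allowed and how long to wait first. */
    interface RetryPolicy {
        // Returns the delay in ms before the next attempt, or -1 to give up.
        long nextDelayMs(int attemptsSoFar, long elapsedMs);
    }

    /** Daemon-friendly policy: never gives up, uses capped exponential backoff. */
    static RetryPolicy forever(long baseMs, long maxMs) {
        return (attempts, elapsed) -> backoff(baseMs, maxMs, attempts);
    }

    /** Human-friendly policy: gives up after maxAttempts or once maxElapsedMs has passed. */
    static RetryPolicy bounded(int maxAttempts, long maxElapsedMs, long baseMs, long maxMs) {
        return (attempts, elapsed) ->
            (attempts >= maxAttempts || elapsed >= maxElapsedMs)
                ? -1
                : backoff(baseMs, maxMs, attempts);
    }

    /** Doubles the delay per failed attempt, capped at maxMs. */
    static long backoff(long baseMs, long maxMs, int attempts) {
        int shift = Math.max(0, Math.min(attempts - 1, 20)); // clamp to avoid overflow
        return Math.min(maxMs, baseMs << shift);
    }

    /** Runs op, consulting the policy after every failure. */
    static <T> T call(Callable<T> op, RetryPolicy policy) throws Exception {
        long start = System.nanoTime();
        int attempts = 0;
        while (true) {
            try {
                return op.call();
            } catch (Exception e) {
                attempts++;
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                long delay = policy.nextDelayMs(attempts, elapsedMs);
                if (delay < 0) {
                    // Fail with a descriptive message instead of looping forever.
                    throw new Exception("giving up after " + attempts + " attempts", e);
                }
                Thread.sleep(delay);
            }
        }
    }
}
```

Under these assumptions, a long-running daemon would pass something like RetrySketch.forever(100, 5000), while an interactive client would pass RetrySketch.bounded(5, 30000, 100, 5000) and get an exception once the budget is spent.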

> John Vines wrote:
>> Of course, it's when I hit send that I realize that we could mitigate by
>> making the client aware of the master state, and if the system is shut
>> down
>> (which was the case for that ticket), then it can fail quickly with a
>> descriptive message.
>> On Mon, Jan 25, 2016 at 10:58 AM John Vines<>  wrote:
>>> While we want to be fault tolerant, there's a point where we want to
>>> eventually fail. I know we have a couple never ending retry loops that
>>> need
>>> to be addressed (,
>>> but I'm unsure if queries suffer from this problem.
>>> Unfortunately, fault tolerance is a bit at odds with instant notification
>>> of system issues, since some of the fault tolerance is temporally
>>> oriented.
>>> And that ticket lacks context of it never failing out vs. failing out
>>> eventually (but too long for the user)
>>> On Sun, Jan 24, 2016 at 7:46 PM Christopher<>  wrote:
>>>> I saw this bug report:
>>>> As far as I can tell, they are reporting normal, expected, and desired
>>>> behavior of Accumulo as a bug. But, is there something we can do
>>>> upstream
>>>> to enable fast failures in the case of Accumulo not running to support
>>>> their use case?
>>>> Personally, I don't see how we can reliably detect within the client
>>>> that
>>>> the cluster is down or up, vs. a normal temporary server
>>>> outage/migration,
>>>> since there is no single point of authority for Accumulo to
>>>> determine its overall operating status if ZooKeeper is running and no
>>>> other
>>>> servers are. Am I wrong?
