accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Interesting bug report
Date Mon, 25 Jan 2016 17:14:53 GMT
I've long been waffling about the usefulness of our "infinite retry" 
logic. It's great for daemons. It sucks for humans.

Maybe there's a story in addressing this via ClientConfiguration -- let 
the user tell us the policy they want to follow.
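
As a rough sketch of what a user-selected policy could look like: every name below (RetryPolicy, bounded, infinite, call) is hypothetical and not part of ClientConfiguration or any existing Accumulo API. It only illustrates the difference between "retry forever" and "give up after N attempts".

```java
import java.util.concurrent.Callable;

/** Hypothetical retry policy; not an existing Accumulo class. */
final class RetryPolicy {
    private final int maxAttempts;   // 0 or less means "retry forever"
    private final long waitMillis;   // fixed pause between attempts

    private RetryPolicy(int maxAttempts, long waitMillis) {
        this.maxAttempts = maxAttempts;
        this.waitMillis = waitMillis;
    }

    /** Daemon-friendly: keep retrying until the operation succeeds. */
    static RetryPolicy infinite(long waitMillis) {
        return new RetryPolicy(0, waitMillis);
    }

    /** Human-friendly: fail with a descriptive error after maxAttempts tries. */
    static RetryPolicy bounded(int maxAttempts, long waitMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        return new RetryPolicy(maxAttempts, waitMillis);
    }

    /** Runs op, retrying on any exception according to this policy. */
    <T> T call(Callable<T> op) throws Exception {
        int attempt = 0;
        while (true) {
            attempt++;
            try {
                return op.call();
            } catch (Exception e) {
                if (maxAttempts > 0 && attempt >= maxAttempts) {
                    // Keep the last failure as the cause so the user sees why.
                    throw new IllegalStateException(
                        "Giving up after " + attempt + " attempts", e);
                }
                Thread.sleep(waitMillis);
            }
        }
    }
}
```

Under this sketch, an interactive tool might use RetryPolicy.bounded(3, 1000) while a long-running daemon keeps RetryPolicy.infinite(1000). How such a choice would actually be wired into the client is left open here.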

John Vines wrote:
> Of course, it's when I hit send that I realize that we could mitigate by
> making the client aware of the master state, and if the system is shut down
> (which was the case for that ticket), then it can fail quickly with a
> descriptive message.
>
> On Mon, Jan 25, 2016 at 10:58 AM John Vines <vines@apache.org> wrote:
>
>> While we want to be fault tolerant, there's a point where we want to
>> eventually fail. I know we have a couple never ending retry loops that need
>> to be addressed (https://issues.apache.org/jira/browse/ACCUMULO-1268),
>> but I'm unsure if queries suffer from this problem.
>>
>> Unfortunately, fault tolerance is a bit at odds with instant notification
>> of system issues, since some of the fault tolerance is temporally oriented.
>> And that ticket lacks context on whether it never fails out or fails out
>> eventually (but takes too long for the user).
>>
>>
>> On Sun, Jan 24, 2016 at 7:46 PM Christopher <ctubbsii@apache.org> wrote:
>>
>>> I saw this bug report:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1300987
>>>
>>> As far as I can tell, they are reporting normal, expected, and desired
>>> behavior of Accumulo as a bug. But, is there something we can do upstream
>>> to enable fast failures in the case of Accumulo not running to support
>>> their use case?
>>>
>>> Personally, I don't see how we can reliably detect within the client that
>>> the cluster is down or up, vs. a normal temporary server outage/migration,
>>> since there is no single point of authority for Accumulo to
>>> determine its overall operating status if ZooKeeper is running and no
>>> other servers are. Am I wrong?
>>>
>
