accumulo-user mailing list archives

From Frans Lawaetz <flawa...@gmail.com>
Subject Re: Is it safe / advisable to increase Zookeeper timeout?
Date Mon, 10 Mar 2014 15:03:04 GMT
On Fri, Mar 7, 2014 at 12:15 PM, Josh Elser <josh.elser@gmail.com> wrote:

> On 3/7/14, 12:01 PM, Terry P. wrote:
>
>> Greetings folks,
>> It seems network woes will never go away for this Accumulo 1.4.2 project
>> :-(
>>
>> They rebooted one of the two "redundant switches" last night, but of
>> course zero redundancy actually took place and the Master lost his
>> zookeeper lock as did one of the Datanodes after 60 seconds and shut
>> itself down.
>>
>
> By datanode you mean tserver? Hadoop datanodes don't communicate with
> ZooKeeper.
>
>
>> The 60 second period is odd, because I see that
>> instance.zookeeper.timeout is actually set to 30s, but I do recall that
>> often by default zookeeper clients retry 2 times before bailing so maybe
>> that's why.
>>
>
> It won't always be 30s before it's seen; I've seen it much quicker too.
> I'm not sure about the retries off the top of my head.


Most likely you were seeing the effects of ACCUMULO-1572, in which a
ZooKeeper disconnect causes an Accumulo failure before the session has
expired.  It is fixed in 1.5.1 and in the to-be-released 1.4.5.  If you
think you're seeing something else, it would be good to hear about it.
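
On the original question of raising the timeout, a rough sketch (Python,
not Accumulo or ZooKeeper code) of how a ZooKeeper server bounds the
session timeout a client asks for may help. It assumes the server's
documented defaults, minSessionTimeout = 2 * tickTime and
maxSessionTimeout = 20 * tickTime, with tickTime = 2000 ms as in the
sample zoo.cfg. The function name and its parameters are made up for
illustration, so check your own zoo.cfg before relying on the numbers.

```python
def negotiated_session_timeout_ms(requested_ms, tick_time_ms=2000,
                                  min_session_ms=None, max_session_ms=None):
    """Hypothetical helper: estimate the session timeout a ZooKeeper
    server would grant for a requested value.

    Assumes the documented server defaults when the bounds are not set
    explicitly: min = 2 * tickTime, max = 20 * tickTime.
    """
    if min_session_ms is None:
        min_session_ms = 2 * tick_time_ms
    if max_session_ms is None:
        max_session_ms = 20 * tick_time_ms
    # The server clamps the client's request into [min, max].
    return max(min_session_ms, min(requested_ms, max_session_ms))


if __name__ == "__main__":
    # instance.zookeeper.timeout of 30s, as in Terry's setup
    print(negotiated_session_timeout_ms(30_000))
    # a request well above the default ceiling
    print(negotiated_session_timeout_ms(120_000))
```

Under those assumptions a requested 30s is granted as-is, while anything
above 40s would be capped unless maxSessionTimeout is also raised on the
ZooKeeper servers.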
