accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Is it safe / advisable to increase Zookeeper timeout?
Date Fri, 07 Mar 2014 17:15:59 GMT
On 3/7/14, 12:01 PM, Terry P. wrote:
> Greetings folks,
> It seems network woes will never go away for this Accumulo 1.4.2 project :-(
>
> They rebooted one of the two "redundant switches" last night, but of
> course zero redundancy actually took place and the Master lost his
> zookeeper lock as did one of the Datanodes after 60 seconds and shut
> itself down.

By datanode you mean tserver? Hadoop datanodes don't communicate with 
ZooKeeper.

> The 60 second period is odd, because I see that
> instance.zookeeper.timeout is actually set to 30s, but I do recall that
> often by default zookeeper clients retry 2 times before bailing so maybe
> that's why.

It won't always be 30s before it's seen; I've seen it much quicker too. 
I'm not sure about the retries off the top of my head.

> My question: is it safe / advisable to increase the zookeeper timeout
> to, say, 60 seconds?  Where can I set that in a file to ensure the
> change is durable?

Yes, as long as you're aware that it will take longer to notice an
actual failure. If a tserver dies or hangs, it could now take up to
twice as long to detect, which would add latency to your application.

You should set that property in accumulo-site.xml. Make sure to place it 
on all nodes in the cluster. I believe you will also have to restart 
Accumulo for it to take effect.
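
For reference, a minimal sketch of what that entry might look like in
accumulo-site.xml, assuming the usual Hadoop-style property layout and the
60s value proposed in the question. The surrounding <configuration> element
already exists in the file; only the <property> block is new.

```xml
<configuration>
  <!-- ... existing properties ... -->

  <!-- ZooKeeper session timeout requested by Accumulo processes.
       The 60s value is the one proposed in the question, not a recommendation. -->
  <property>
    <name>instance.zookeeper.timeout</name>
    <value>60s</value>
  </property>
</configuration>
```

One thing worth checking: the ZooKeeper server negotiates the session
timeout and caps it at maxSessionTimeout, which defaults to 20 * tickTime
(40s with the default tickTime of 2000 ms). If that default is in place,
a 60s request may be reduced unless maxSessionTimeout is also raised in
zoo.cfg.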

> Thanks in advance,
> Terry
