zookeeper-user mailing list archives

From Hendrik Haddorp <hendrik.hadd...@gmx.net>
Subject Re: recover failed node
Date Wed, 25 Jan 2017 06:48:05 GMT
Hi Ben,
my setup is running on Docker. The work directory is mounted as a Docker
volume and that got lost; just the config was left. Given that all ports
and host names were unchanged, I did not expect any communication
problems. But looking into the logs again as you suggested, I found that
the healthy node could not reach the node that had failed. We had an
additional problem with the Docker host of that machine, which is also
why the volume was lost, and it looks like the DNS lookup had a problem.
After I restarted one of the good nodes, ZooKeeper recovered and all
nodes are healthy again :-)
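
For anyone hitting the same symptom, here is a minimal sketch of the kind
of check that would have pointed at the DNS problem earlier. It only
separates "the name does not resolve" from "the name resolves but nothing
accepts the connection". The function names and the host names in the
usage note are made up for illustration; the ports are assumed to be the
conventional 2181 (client), 2888 (quorum) and 3888 (election), so adjust
them to whatever your zoo.cfg actually uses.

```python
import socket


def check_peer(host, port, timeout=2.0):
    """Return 'ok', 'dns-failure' or 'unreachable' for one host:port.

    Hypothetical helper, not part of ZooKeeper.
    """
    try:
        # Resolve the name first so DNS failures are reported separately.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"

    # Try every resolved address until one accepts a TCP connection.
    for family, socktype, proto, _canonname, addr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(addr)
                return "ok"
        except OSError:
            continue
    return "unreachable"


def check_ensemble(hosts, ports, timeout=2.0):
    """Check every host/port pair and return {(host, port): status}."""
    return {
        (host, port): check_peer(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

Run from inside each container, something like
check_ensemble(["zk1", "zk2", "zk3"], [2181, 2888, 3888]) should show
whether a peer fails at name resolution or at the TCP connect. Keep in
mind that port 2888 is normally only open on the current leader, so an
"unreachable" there on a follower is expected.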

thanks,
Hendrik

On 25.01.2017 01:34, Ben Sherman wrote:
> Do you know why the node lost its data?  Are your configuration files
> correct?  Is it trying to join the ensemble?  Are there any mentions of the
> broken node trying to reach the good nodes in the good nodes' logs?
>
> On Tue, Jan 24, 2017 at 1:06 PM, Hendrik Haddorp <hendrik.haddorp@gmx.net>
> wrote:
>
>> Hi,
>>
>> I assume this is quite a standard issue but I have failed to find a
>> solution so far. I have a 3-node ZooKeeper 3.4.6 ensemble and one node
>> lost all its data. My assumption was that when the node comes up again
>> ZooKeeper would send over the state from the remaining nodes to
>> reinitialize it, but that does not seem to happen. So what can I do to
>> recover my node without changing the two remaining nodes? I tried
>> copying the snapshots and logs from one node, but that has not worked
>> so far.
>>
>> thanks,
>> Hendrik
>>

