zookeeper-user mailing list archives

From Christian Ziech <christian.zi...@nokia.com>
Subject Data loss for actions happening after a truncate in 3.4.3
Date Fri, 15 Jun 2012 16:38:24 GMT
This issue seems to affect only ZooKeeper 3.4.3 (and not 3.3.5).
Basically, it seems that after the truncate method is invoked, the
logStream member of the FileTxnLog still points to the old position
in the file, where it would have written the next entry before the
truncate happened. Since the log file is not rolled over and the stream
is not reset, the next write creates a gap in the file, and that gap is
interpreted as the end of the file when the log is read.
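
The mechanism can be illustrated outside of ZooKeeper with a minimal
Java sketch. This is not ZooKeeper code: the class TruncateGapDemo, the
fixed RECORD_SIZE, and the "first byte zero means end of log" reader
are hypothetical stand-ins. It only assumes that a FileOutputStream
keeps its own write position when the file is shortened through a
second handle, and that the file system fills the resulting hole with
zeros (which is the usual behaviour on POSIX systems).

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;

class TruncateGapDemo {
    // Hypothetical fixed record size; real transaction records vary in length.
    static final int RECORD_SIZE = 8;

    // Builds a record whose bytes all carry the (non-zero) record id.
    static byte[] record(int id) {
        byte[] r = new byte[RECORD_SIZE];
        Arrays.fill(r, (byte) id);
        return r;
    }

    // Writes records 1..3, truncates the file to one record through a
    // second handle, then writes record 4 through the ORIGINAL stream,
    // which is neither reopened nor repositioned.
    static byte[] writeTruncateWrite() {
        try {
            File log = File.createTempFile("txnlog-demo", ".bin");
            log.deleteOnExit();
            try (FileOutputStream logStream = new FileOutputStream(log)) {
                for (int i = 1; i <= 3; i++) {
                    logStream.write(record(i));
                }
                logStream.flush();
                // Truncate via a separate handle, leaving logStream untouched.
                try (RandomAccessFile raf = new RandomAccessFile(log, "rw")) {
                    raf.setLength(RECORD_SIZE);
                }
                // The stream still writes at its old offset (3 * RECORD_SIZE).
                logStream.write(record(4));
            }
            return Files.readAllBytes(log.toPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Mimics a reader that treats a zeroed record as the end of the log.
    static int countReadableRecords(byte[] data) {
        int count = 0;
        for (int off = 0; off + RECORD_SIZE <= data.length; off += RECORD_SIZE) {
            if (data[off] == 0) {
                break; // gap reached: everything after it is invisible
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] data = writeTruncateWrite();
        System.out.println("file length: " + data.length);
        System.out.println("records visible to the reader: "
                + countReadableRecords(data));
    }
}
```

In this sketch, record 4 is physically in the file, but a reader that
stops at the first zeroed record never reaches it. Repositioning or
reopening the stream after the truncate would avoid the hole here;
whether that is the right fix for FileTxnLog is a separate question.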

That means that once this node becomes leader later on, it sends a
snapshot to all its peers that only contains the entries up to the
truncation; none of the entries written after it are sent. We had this
happen on 2 of the 3 ZooKeeper servers of a test cluster while the
network connection was bad. Even after the nodes recovered, we would
lose all of that data every time leadership switched to one of those
two nodes.

Furthermore (and this is something I have not yet been able to
reproduce reliably), it seems that in some situations, after a few
leader changes, the transaction log file does not just contain a gap
but simply stops after the last entry before the truncation.

I have a small program that reproduces the error reliably on 3.4.3
but not on 3.3.5. That seems to be because in 3.3.5 the new leader does
not send the truncation message to a peer that is more advanced than
the new leader. The underlying problem, however, also seems to be
present in 3.3.5 (I just couldn't get the TRUNC message to be sent in
my test).

Have other people already encountered the same issue?

I will create a ticket with the test that reproduces the issue later,
but first I need to spend some more time on that script. Things are a
little hard to reproduce because I have to pull a ZooKeeper server out
of the ensemble for some time without restarting it. To do so, I use
port forwarding instead of direct connections, since I can interrupt
the forwarding even on localhost.

What more information do you guys need to investigate the issue?
