zookeeper-user mailing list archives

From Adam Milne-Smith <a...@milne-smith.co.uk>
Subject Re: Snapshot containing partial transaction
Date Thu, 16 Jul 2015 12:21:46 GMT
So it seems that this is fine as long as we always back up the transaction logs as well.

That said, I'm not sure whether there is a scenario in which replaying the log file fails
because a transaction is applied to the partially updated (erroneous) state that the
snapshot initialises.

With this in mind, our ZOOKEEPER-2141 patch may have to ignore some unexpected state rather
than throw an exception. We'll follow up on this at a later date.
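
To illustrate the kind of tolerance described above, here is a minimal sketch of a replay loop that skips a transaction whose effect is already present in the restored state instead of throwing. Everything in it is hypothetical: `TolerantReplaySketch`, `Txn`, `Op` and `replay` are invented for the illustration and are not ZooKeeper classes, and the data tree is reduced to a plain map. It is not the ZOOKEEPER-2141 patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model: none of these types are ZooKeeper classes.
class TolerantReplaySketch {
    enum Op { CREATE, DELETE }

    // One logged transaction: its zxid, the operation, and the target path.
    static final class Txn {
        final long zxid;
        final Op op;
        final String path;

        Txn(long zxid, Op op, String path) {
            this.zxid = zxid;
            this.op = op;
            this.path = path;
        }
    }

    // Applies every transaction newer than snapshotZxid to the restored tree.
    // A CREATE of an existing node or a DELETE of a missing node is counted
    // and skipped instead of aborting the replay.
    static int replay(Map<String, String> tree, List<Txn> log, long snapshotZxid) {
        int alreadyApplied = 0;
        for (Txn txn : log) {
            if (txn.zxid <= snapshotZxid) {
                continue; // at or before the zxid the snapshot is named after
            }
            if (txn.op == Op.CREATE) {
                if (tree.putIfAbsent(txn.path, "") != null) {
                    alreadyApplied++;
                }
            } else {
                if (tree.remove(txn.path) == null) {
                    alreadyApplied++;
                }
            }
        }
        return alreadyApplied;
    }

    public static void main(String[] args) {
        // Snapshot named after zxid 100 that already holds a node created by zxid 101.
        Map<String, String> tree = new TreeMap<>();
        tree.put("/locks/lock-1", "");

        List<Txn> log = new ArrayList<>();
        log.add(new Txn(101L, Op.CREATE, "/locks/lock-1"));
        log.add(new Txn(102L, Op.DELETE, "/locks/lock-1"));

        int skipped = replay(tree, log, 100L);
        System.out.println("skipped=" + skipped + " remaining=" + tree.keySet());
    }
}
```

In this toy run the duplicate create is skipped and the later delete still removes the node, so the replayed state matches what the other quorum members would hold. Whether the real replay path can always be made tolerant in this way is exactly the open question raised above.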


On 16 Jul 2015 12:30, Flavio Junqueira wrote:
>
> (moving discussion to dev)
>
> Hi Adam,
>
> I can't see a problem with your description about the snapshot generation,
> but I would expect that replaying the transaction log would bring back the
> missing transactions. We replay from the zxid in the snapshot name, which is
> taken before the snapshot starts (FileTxnSnapLog.save(...)).
>
> -Flavio
>
> > On 16 Jul 2015, at 12:02, Adam Milne-Smith wrote:
> >
> > I've created a jira ticket here:
> > https://issues.apache.org/jira/browse/ZOOKEEPER-2234
> >
> > Thanks,
> > Adam
> >
> > On 15 Jul 2015 16:07, Adam Milne-Smith wrote:
> >>
> >> Whilst writing a patch for ZOOKEEPER-2141 (3.4.6 branch), we spotted an
> >> ephemeral node that had not been deleted despite its session having
> >> expired. Its ACL long did not exist in the ACL cache so any operation
> >> against this node will fail.
> >>
> >> This could lead to things like curator locks never being deleted (even
> >> after the timeout) and deadlocking applications.
> >>
> >> We inspected the code and are reasonably certain that there are no bugs
> >> in updating the in-memory data tree that could cause this. However
> >> serialising the snapshot happens asynchronously and follows these 4 steps:
> >>
> >> -copy the sessions map
> >> -serialise the sessions map copy
> >> -serialise the ACL map (synchronised)
> >> -serialise the data tree (synchronised at the individual node level)
> >>
> >> We suspect the issue we are seeing is a new session and ephemeral node
> >> being created during the data tree serialisation hence the corresponding
> >> session and acl are missing from the snapshot but the node is present.
> >> This means the snapshot contains a partial transaction.
> >>
> >> If we were to deserialise from this snapshot then the data in-memory
> >> would be invalid. If one member of the quorum were to reboot and restore
> >> from this snapshot, it would contain this node where the other hosts had
> >> removed it. If this host were to become the leader and send its snapshot
> >> to other members of the quorum, those would have the invalid data too.
> >>
> >> As far as we can see, the only way to delete this node when this happens
> >> in production would be to perform manual surgery on the snapshot.
> >>
> >> Can anyone confirm that they agree this to be the case or let us know if
> >> we've misunderstood something?
> >>
> >> Thanks,
> >> Adam