zookeeper-user mailing list archives

From Flavio Junqueira <...@apache.org>
Subject Re: How to handle zookeeper data inconsistency
Date Thu, 21 Jan 2016 15:39:27 GMT
I should have added that you can use sync() followed by exists() if you want to catch old deletes
that are still propagating. The sync() call is supposed to flush the pending updates from the
leader to the server your client is connected to, so the exists() that follows reflects them.
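
To make the sync()-then-exists() idea concrete, here is a toy model in Java. It is only a sketch: ToyLeader, ToyFollower and StaleReadDemo are hypothetical names invented for illustration and are not ZooKeeper classes. It assumes a follower that applies the leader's committed txns in order but possibly late. In the real Java client, sync() is asynchronous and takes a callback, so the exists() call would be issued on the same session after it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these classes are part of ZooKeeper.
class ToyLeader {
    // Committed txns in order, e.g. "create /znode1", "delete /znode1".
    final List<String> committed = new ArrayList<>();

    void commit(String op, String path) {
        committed.add(op + " " + path);
    }
}

class ToyFollower {
    private final ToyLeader leader;
    private final Set<String> znodes = new HashSet<>();
    private int applied = 0; // number of leader txns applied locally

    ToyFollower(ToyLeader leader) {
        this.leader = leader;
    }

    // Apply at most `count` more committed txns, modelling slow propagation.
    void receive(int count) {
        int target = Math.min(leader.committed.size(), applied + count);
        while (applied < target) {
            String[] txn = leader.committed.get(applied++).split(" ");
            if (txn[0].equals("create")) {
                znodes.add(txn[1]);
            } else {
                znodes.remove(txn[1]);
            }
        }
    }

    // Models sync(): catch up with everything the leader has committed so far.
    void sync() {
        receive(leader.committed.size() - applied);
    }

    // Models exists(): a purely local read of this server's state.
    boolean exists(String path) {
        return znodes.contains(path);
    }
}

class StaleReadDemo {
    public static void main(String[] args) {
        ToyLeader b = new ToyLeader();
        ToyFollower c = new ToyFollower(b);
        b.commit("create", "/znode1");
        c.receive(1);                             // C has seen the create
        b.commit("delete", "/znode1");            // C has not seen the delete yet
        System.out.println(c.exists("/znode1"));  // true: stale read on C
        c.sync();
        System.out.println(c.exists("/znode1"));  // false: delete is now visible
    }
}
```

The first read is stale because exists() is served from C's local state. The read after sync() sees the delete.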

-Flavio

> On 21 Jan 2016, at 15:34, Flavio Junqueira <fpj@apache.org> wrote:
> 
> 
>> On 21 Jan 2016, at 15:03, Mohammad arshad <mohammad.arshad@huawei.com> wrote:
>> 
>> Thanks, Flavio Junqueira, for your response.
>> 
>> Assume C received the commit request but failed before committing it.
>> When will C be synced? What event at the leader or follower will sync
>> it up?
> 
> If C failed, then when it comes back up, it will sync up with the leader
> and learn everything that has been committed. This is part of the recovery
> process. Even though there are multiple steps, you can assume that once C
> is back online, its state will reflect all txns that were previously
> committed, except for the ones that are still in flight.
> 
>> 
>> Here is another scenario we faced.
>> A node got deleted successfully on leader node B. But due to a network
>> issue on the leader node, the delete could not sync up to followers A
>> and C. At this moment, the leader node also goes down as faulty.
>> 
> 
> If B successfully committed the delete operation, even if the commit
> message didn't go out, then it means that at least one other node got the
> proposal. In your 3-server ensemble, a quorum has size 2, so any proposal
> needs to be persisted and acknowledged by a quorum before it is committed.
> 
>> Now one of A and C becomes leader, but it has inconsistent data (the
>> delete is not executed there).
>> 
> 
> It will be executed there because the new leader, A or C, needs to commit
> the initial state of the new epoch and it will do it based on its log
> state, which will include the delete operation.
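
The quorum argument in this part of the thread can be checked with a small Java sketch. It assumes plain majority quorums, and QuorumSketch with its methods is a hypothetical helper written for illustration, not ZooKeeper code. It computes the majority size for an ensemble and verifies that any two quorums share at least one server, which is why a quorum that elects a new leader must contain a server that persisted the acknowledged delete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative helper only: assumes simple majority quorums.
class QuorumSketch {
    // Majority quorum size for an ensemble of n voting servers.
    static int quorumSize(int n) {
        return n / 2 + 1;
    }

    // Every subset of `servers` that contains at least a majority.
    static List<Set<String>> quorums(List<String> servers) {
        List<Set<String>> result = new ArrayList<>();
        int n = servers.size();
        for (int mask = 0; mask < (1 << n); mask++) {
            if (Integer.bitCount(mask) < quorumSize(n)) {
                continue;
            }
            Set<String> quorum = new HashSet<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    quorum.add(servers.get(i));
                }
            }
            result.add(quorum);
        }
        return result;
    }

    // True if every pair of quorums shares at least one server.
    static boolean allQuorumsIntersect(List<String> servers) {
        List<Set<String>> all = quorums(servers);
        for (Set<String> x : all) {
            for (Set<String> y : all) {
                if (Collections.disjoint(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> ensemble = Arrays.asList("A", "B", "C");
        System.out.println(quorumSize(ensemble.size()));   // 2
        System.out.println(allQuorumsIntersect(ensemble)); // true
    }
}
```

With B down, the only remaining quorum is {A, C}, and it overlaps the quorum that acknowledged the delete (B plus A, or B plus C).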
> 
>> As far as I know, this behavior is fine as per the current ZK design.
>> But to solve the above data inconsistency issue, any suggestions? I
>> thought of committing the delete not only on the leader but on at least
>> N/2 nodes in the same client call, and only then marking the delete as
>> successful.
> 
> No, not fine. If a quorum has acknowledged a txn, then we guarantee that
> the corresponding operation is durable. The thing that is ok as per ZK
> design is that the delete operation is acknowledged, and a particular
> server, say C, only receives it a little later. In this case, it could
> happen that a client reads the ZK state but misses the delete. However, if
> the client keeps reading, then it should eventually see the delete.
> 
> Another thing that is fine is that if no quorum acknowledges a txn, then
> the txn isn't durable.
> 
> -Flavio
> 
>> 
>> -----Original Message-----
>> From: Flavio Junqueira [mailto:fpj@apache.org] 
>> Sent: 21 January 2016 19:11
>> To: user@zookeeper.apache.org
>> Cc: dev
>> Subject: Re: How to handle zookeeper data inconsistency
>> 
>> Hi Mohammad,
>> 
>> A delete operation only needs to reach a quorum to complete, and A and B
>> form a quorum in your 3-server ensemble. If the delete operation never
>> gets propagated to C and other write operations that have been ordered
>> later complete on C, then you have an issue. If C simply stops receiving
>> updates, then you have a problem with your C server and it could be a
>> problem with ZK or just the environment.
>> 
>> If there have been write operations ordered after the delete and server
>> C has seen those but not the delete, then I'd recommend that you have a
>> look at the txn logs with the log formatter.
>> 
>>> shall I check exists from leader only? but even leader can have some 
>>> node undeleted in the above scenario
>> 
>> There is no such requirement, but you need to be aware that server C
>> could definitely make an update visible later compared to other servers.
>> ZooKeeper doesn't guarantee that updates are visible to all clients as
>> soon as they are acknowledged.
>> 
>> I'd also search for jiras, especially if you're deleting an ephemeral. 
>> 
>> -Flavio
>> 
>>> On 21 Jan 2016, at 13:24, Mohammad arshad <mohammad.arshad@huawei.com> wrote:
>>> 
>>> Hi All,
>>> I came across a scenario where ZooKeeper was left in an inconsistent
>>> state (but that is valid as per the ZooKeeper theory), and because of
>>> that the dependent applications behaved wrongly. The scenario is as
>>> follows:
>>> 
>>> 1) I have a three-server ZooKeeper cluster; let's say the servers are
>>> A, B and C. B is the leader.
>>> 2) In one successful delete operation, a znode znode1 was deleted from
>>> A and B but somehow not deleted from C. The reason it was not deleted
>>> from C can be that either the proposal or the commit failed.
>>> 3) Now for an application connected to C, ZooKeeper.exists returns
>>> znode1, and that is why the application enters the node-exists flow,
>>> which is wrong.
>>> 
>>> Shall I check exists from the leader only? But even the leader can have
>>> some node undeleted in the above scenario. Any guideline to handle the
>>> above-said valid data inconsistency?
>>> 
>>> Any suggestion/help is highly appreciated.
>>> 
>>> Best Regards
>>> Mohammad Arshad
>>> HUAWEI TECHNOLOGIES CO.LTD.
>>> Huawei Technologies India Pvt. Ltd.
>>> Near EPIP Industrial Area, Kundalahalli Village Whitefield, 
>>> Bangalore-560066 www.huawei.com<http://www.huawei.com/>
>>> 
>> 
> 

