zookeeper-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2355) Ephemeral node is never deleted if follower fails while reading the proposal packet
Date Tue, 13 Jun 2017 20:53:02 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048391#comment-16048391 ]

ASF GitHub Bot commented on ZOOKEEPER-2355:
-------------------------------------------

Github user hanm commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/112#discussion_r121793647
  
    --- Diff: src/java/main/org/apache/zookeeper/server/quorum/Learner.java ---
    @@ -390,6 +391,7 @@ else if (qp.getType() == Leader.SNAP) {
                                 + Long.toHexString(qp.getZxid()));
                         System.exit(13);
                     }
    +                zk.getZKDatabase().setlastProcessedZxid(qp.getZxid());
    --- End diff --
    
    The fix looks good to me.
    
    I think we should also set the zxid extracted from the current proposal packet as the last
processed zxid after each proposal is [committed](https://github.com/arshadmohammad/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L460).
Otherwise the follower will have a lagged view of the committed transactions, because with
the fix in this patch we never call setlastProcessedZxid during a DIFF sync. For example,
imagine a case like this:
    * The follower's latest zxid is a before the DIFF sync happens.
    * The leader sends over proposals with zxids b, c, and d.
    * The follower receives and applies proposals b and c. Before it has a chance to receive
d, a network partition happens.
    * Once the partition heals, the follower does a DIFF sync again. Because the zk database is
not reloaded from the logs (it's already initialized), the follower has a skewed view of the
world: it thinks it only has txn a, but in fact it has a, b, and c. So rather than asking for
b, c, and d, the follower could just ask for d.
    
    Anyway, I think this is an optimization that might be worth doing. It is not functionally
critical, because applying transactions is idempotent.
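
    The a/b/c/d scenario above can be sketched with a toy model. This is only an illustration under stated assumptions, not ZooKeeper's actual Learner or Leader code: the class `DiffSyncSketch`, its `Follower` type, the `trackZxidOnCommit` flag, and `diffFor` are all hypothetical names, and zxids a, b, c, d are modelled as 1, 2, 3, 4.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

/** Toy model of a DIFF sync; hypothetical, not ZooKeeper's real classes. */
final class DiffSyncSketch {

    /** Hypothetical follower that only records which zxids it has applied. */
    static final class Follower {
        private final TreeMap<Long, String> applied = new TreeMap<>();
        private final boolean trackZxidOnCommit;
        private long lastProcessedZxid;

        Follower(long initialZxid, boolean trackZxidOnCommit) {
            this.lastProcessedZxid = initialZxid;
            this.trackZxidOnCommit = trackZxidOnCommit;
            applied.put(initialZxid, "txn-" + initialZxid);
        }

        /** Applies a committed proposal; re-applying the same zxid changes nothing. */
        void commit(long zxid) {
            applied.put(zxid, "txn-" + zxid);
            if (trackZxidOnCommit) {
                // The suggested optimization: remember the zxid after each commit.
                lastProcessedZxid = Math.max(lastProcessedZxid, zxid);
            }
        }

        long lastProcessedZxid() {
            return lastProcessedZxid;
        }

        int appliedCount() {
            return applied.size();
        }
    }

    /** Leader side: the proposals newer than the zxid the follower reports. */
    static List<Long> diffFor(List<Long> leaderLog, long followerZxid) {
        List<Long> diff = new ArrayList<>();
        for (long zxid : leaderLog) {
            if (zxid > followerZxid) {
                diff.add(zxid);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        List<Long> leaderLog = Arrays.asList(1L, 2L, 3L, 4L); // a, b, c, d
        for (boolean track : new boolean[] {false, true}) {
            Follower follower = new Follower(1L, track); // starts at a
            follower.commit(2L); // b
            follower.commit(3L); // c, then the partition hits before d arrives
            List<Long> resync = diffFor(leaderLog, follower.lastProcessedZxid());
            for (long zxid : resync) {
                follower.commit(zxid);
            }
            System.out.println("track=" + track + " resync=" + resync
                    + " applied=" + follower.appliedCount());
        }
    }
}
```

    Without tracking, the second sync replays b, c, and d (`resync=[2, 3, 4]`); with tracking, it only needs d (`resync=[4]`). In both runs the follower ends up with the same four transactions, which is the idempotency point: the extra replay costs bandwidth and work but does not change the final state.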


> Ephemeral node is never deleted if follower fails while reading the proposal packet
> -----------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2355
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum, server
>    Affects Versions: 3.4.8, 3.4.9, 3.4.10, 3.5.1, 3.5.2, 3.5.3
>            Reporter: Mohammad Arshad
>            Assignee: Mohammad Arshad
>            Priority: Critical
>         Attachments: ZOOKEEPER-2355-01.patch, ZOOKEEPER-2355-02.patch, ZOOKEEPER-2355-03.patch,
>                      ZOOKEEPER-2355-04.patch, ZOOKEEPER-2355-05.patch
>
>
> A ZooKeeper ephemeral node is never deleted if a follower fails while reading the proposal packet.
> The scenario is as follows:
> # Configure a three-node ZooKeeper cluster, say nodes A, B, and C. Start all of them and assume A is the leader and B and C are followers.
> # Connect to any of the servers and create ephemeral node /e1.
> # Close the session, which triggers deletion of ephemeral node /e1.
> # While it is receiving the delete proposal, make Follower B fail with {{SocketTimeoutException}}. We need to inject this fault to reproduce the scenario; in a production environment it happens because of a network fault.
> # Remove the fault and check that the faulted follower is connected to the quorum again.
> # Connect to any of the servers and create the same ephemeral node /e1; the creation succeeds.
> # Close the session, which triggers deletion of ephemeral node /e1.
> # {color:red}/e1 is not deleted from the faulted Follower B. It should have been deleted, as it was created again with another session.{color}
> # {color:green}/e1 is deleted from Leader A and the other Follower C.{color}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
