zookeeper-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2418) txnlog diff sync can skip sending some transactions to followers
Date Thu, 04 Jul 2019 01:18:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878257#comment-16878257 ]

Hudson commented on ZOOKEEPER-2418:

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #600 (See [https://builds.apache.org/job/ZooKeeper-trunk/600/])
ZOOKEEPER-2418: txnlog diff sync can skip sending some transactions t… (hanm: rev 4cadbb1a649b70a2243bc4c1e5f736df4d35c462)
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/LearnerHandler.java
* (edit) zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/LearnerHandlerTest.java

> txnlog diff sync can skip sending some transactions to followers
> ----------------------------------------------------------------
>                 Key: ZOOKEEPER-2418
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2418
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.1
>            Reporter: Nicholas Wolchko
>            Assignee: Brian Nixon
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.6.0
>   Original Estimate: 168h
>          Time Spent: 1h 40m
>  Remaining Estimate: 166h 20m
> If the leader is having disk issues so that its on-disk txnlog is behind the in-memory
> commit log, it will send a DIFF that is missing the transactions in between the two.
> Example:
> There are 5 hosts in the cluster. 1 is the leader. 5 is disconnected.
> We commit up to zxid 1000.
> At zxid 450, the leader's disk stalls, but we still commit transactions because 2,3,4
> are up and acking writes.
> At zxid 1000, the txnlog on the leader has 1-450 and the commit log has 500-1000.
> Then host 5 regains its connection to the cluster and syncs with the leader. It will
> receive a DIFF containing zxids 1-450 and 500-1000, so zxids 451-499 are never sent.
> This is because queueCommittedProposals in the LearnerHandler just queues everything
> within its zxid range. It doesn't give an error if there is a gap between peerLastZxid and
> the iterator it is queueing from.
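
The gap described in the quoted report can be illustrated with a small standalone sketch. This is not the LearnerHandler code and not the committed fix; `DiffGapSketch`, `queueWithGapCheck`, and `range` are hypothetical names, and the sketch assumes zxids within the sync range increase by exactly one (a single epoch), as in the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DiffGapSketch {

    // Hypothetical helper: queues zxids from the txnlog and then the commit log,
    // and fails if the two sources do not cover every zxid after peerLastZxid.
    // Assumption: zxids are consecutive integers within the range being synced.
    public static List<Long> queueWithGapCheck(long peerLastZxid,
                                               List<Long> txnLog,
                                               List<Long> commitLog) {
        List<Long> queued = new ArrayList<>();
        long expected = peerLastZxid + 1;
        for (List<Long> source : Arrays.asList(txnLog, commitLog)) {
            for (long zxid : source) {
                if (zxid < expected) {
                    continue; // peer already has it, or it was queued from the txnlog
                }
                if (zxid > expected) {
                    throw new IllegalStateException(
                        "gap: missing zxids " + expected + "-" + (zxid - 1));
                }
                queued.add(zxid);
                expected++;
            }
        }
        return queued;
    }

    // Builds the inclusive list of zxids from..to.
    public static List<Long> range(long from, long to) {
        List<Long> zxids = new ArrayList<>();
        for (long z = from; z <= to; z++) {
            zxids.add(z);
        }
        return zxids;
    }

    public static void main(String[] args) {
        // Scenario from the report: txnlog holds 1-450, commit log holds 500-1000.
        try {
            queueWithGapCheck(0, range(1, 450), range(500, 1000));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the report's numbers the check fails on the first commit-log entry (zxid 500) because zxid 451 was expected. What the leader should do on detecting such a gap (for example, fall back to a different sync mode) is outside this sketch.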

This message was sent by Atlassian JIRA
