zookeeper-dev mailing list archives

From "Fangmin Lv (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ZOOKEEPER-3500) Improving the ZAB UPTODATE semantic to only issue it to learner when there is limited lagging
Date Thu, 08 Aug 2019 17:13:00 GMT
Fangmin Lv created ZOOKEEPER-3500:
-------------------------------------

             Summary:  Improving the ZAB UPTODATE semantic to only issue it to learner when
there is limited lagging
                 Key: ZOOKEEPER-3500
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3500
             Project: ZooKeeper
          Issue Type: Improvement
          Components: server
            Reporter: Fangmin Lv
            Assignee: Fangmin Lv


With a large snapshot and high write RPS, when a learner is doing a SNAP sync with the leader,
there will be lots of txns that need to be replayed between the NEWLEADER and UPTODATE packets.
 
Depending on how big the snapshot and the traffic are, in our benchmark it may take more than
30s to replay all those txns, which means that when the learner processes the UPTODATE packet
it is still 30s behind the leader. At 10K txns/s, that is 300K txns of lag.
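
The estimate above is just replay time multiplied by write rate. A minimal sketch of that
arithmetic follows; the class and method names are made up for illustration and are not
ZooKeeper code.

```java
// Back-of-the-envelope estimate only; LagEstimate is a hypothetical helper,
// not part of ZooKeeper.
final class LagEstimate {

    /**
     * Number of txns the leader commits while the learner spends
     * replaySeconds replaying, at a steady write rate of txnsPerSecond.
     */
    static long laggingTxns(long replaySeconds, long txnsPerSecond) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(replaySeconds, txnsPerSecond);
    }

    public static void main(String[] args) {
        // Figures from the description: 30s of replay at 10K txns/s.
        System.out.println(laggingTxns(30, 10_000)); // prints 300000
    }
}
```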
 
The learner starts serving client traffic right after it receives the UPTODATE packet, which
means clients will see lots of stale data.
 
The idea here is to have the leader check the lag and send the UPTODATE packet only when the
learner is a limited number of txns behind. This doesn't change the ZAB protocol, but it does
change when ZK applies the txns between NEWLEADER and UPTODATE.
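
One way to picture the leader-side check is a simple threshold gate. The sketch below is an
illustration under stated assumptions, not the proposed patch: UpToDateGate, maxLagTxns and
shouldSendUpToDate are hypothetical names, and the two positions are assumed to be plain txn
counters (real zxids carry an epoch in their high bits, so subtracting them directly would
not give a txn count across epochs).

```java
// Hypothetical sketch of a leader-side gate; none of these names exist in ZooKeeper.
final class UpToDateGate {

    // Assumed tunable: the largest lag, in txns, at which UPTODATE may still be sent.
    private final long maxLagTxns;

    UpToDateGate(long maxLagTxns) {
        if (maxLagTxns < 0) {
            throw new IllegalArgumentException("maxLagTxns must be >= 0");
        }
        this.maxLagTxns = maxLagTxns;
    }

    /**
     * @param leaderCommitted count of txns committed on the leader (assumed plain counter)
     * @param learnerAcked    count of txns the learner has acknowledged (assumed plain counter)
     * @return true when the learner is close enough to the leader to receive UPTODATE
     */
    boolean shouldSendUpToDate(long leaderCommitted, long learnerAcked) {
        // Clamp at zero so a learner reported ahead of the leader counts as caught up.
        long lag = Math.max(0L, leaderCommitted - learnerAcked);
        return lag <= maxLagTxns;
    }
}
```

With a threshold of 1000 txns, for example, a learner that is 300K txns behind would keep
syncing, while one that is 500 txns behind would be sent UPTODATE.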
 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
