zookeeper-issues mailing list archives

From "Enrico Olivelli (Jira)" <j...@apache.org>
Subject [jira] [Updated] (ZOOKEEPER-1032) speed up recovery from leader failure
Date Fri, 06 Sep 2019 15:45:07 GMT

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enrico Olivelli updated ZOOKEEPER-1032:
---------------------------------------
    Fix Version/s:     (was: 3.5.6)

> speed up recovery from leader failure
> -------------------------------------
>
>                 Key: ZOOKEEPER-1032
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1032
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>            Reporter: jiangwen wei
>            Priority: Major
>             Fix For: 3.6.0, 3.5.7
>
>
> When the number of nodes is large, it may take a long time to recover from a leader failure.
> There are several points to improve:
> 1. A Follower should take its snapshot asynchronously once it is up to date.
> 2. Currently, the Leader/Follower clears the DataTree on a leader failure and then restores it from a snapshot and the transaction logs. The DataTree should not be cleared; it should only be restored from the transaction logs.
> 3. FileTxnLog should keep recent transaction logs in memory, so that when the DataTree is not far behind the transaction logs, the in-memory logs can be used to restore the DataTree.
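
Point 3 can be illustrated with a minimal sketch of a bounded in-memory buffer of recent transactions. This is not ZooKeeper's FileTxnLog API: the class RecentTxnBuffer, its methods, and the floorZxid bookkeeping are all hypothetical names introduced here for illustration. The sketch assumes zxids are appended in strictly increasing order and that the caller falls back to a snapshot plus the on-disk logs whenever the buffer cannot cover the gap.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of an in-memory cache of recent transactions.
// Not part of ZooKeeper; names and structure are assumptions.
final class RecentTxnBuffer {
    // One committed transaction: its zxid and serialized payload.
    static final class Txn {
        final long zxid;
        final byte[] payload;
        Txn(long zxid, byte[] payload) {
            this.zxid = zxid;
            this.payload = payload;
        }
    }

    private final int capacity;
    private final Deque<Txn> recent = new ArrayDeque<>();
    // zxid of the last transaction NOT held in the buffer: either the state
    // the buffer started from, or the most recently evicted transaction.
    private long floorZxid;

    RecentTxnBuffer(int capacity, long startZxid) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.floorZxid = startZxid;
    }

    // Record a transaction, evicting the oldest one when the buffer is full.
    synchronized void append(long zxid, byte[] payload) {
        Txn last = recent.peekLast();
        long newest = (last != null) ? last.zxid : floorZxid;
        if (zxid <= newest) {
            throw new IllegalArgumentException("zxids must be strictly increasing");
        }
        if (recent.size() == capacity) {
            floorZxid = recent.removeFirst().zxid;
        }
        recent.addLast(new Txn(zxid, payload));
    }

    // Returns the transactions with zxid > lastAppliedZxid, or empty when
    // the buffer no longer holds everything after that zxid. On empty, the
    // caller must restore from a snapshot and the on-disk logs instead.
    synchronized Optional<List<Txn>> txnsAfter(long lastAppliedZxid) {
        if (lastAppliedZxid < floorZxid) {
            return Optional.empty();
        }
        List<Txn> out = new ArrayList<>();
        for (Txn t : recent) {
            if (t.zxid > lastAppliedZxid) {
                out.add(t);
            }
        }
        return Optional.of(out);
    }
}
```

With a capacity of 3 and zxids 1 through 5 appended, the buffer holds 3, 4 and 5. A DataTree at zxid 3 can be caught up from memory with transactions 4 and 5, while one at zxid 1 gets an empty result and has to take the slower path. Tracking the evicted zxid rather than assuming contiguous zxids is a deliberate choice in this sketch, since zxids need not be consecutive.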



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
