zookeeper-dev mailing list archives

From "Fangmin Lv (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (ZOOKEEPER-3394) Delay observer reconnect when all learner masters have been tried
Date Tue, 21 May 2019 21:44:00 GMT

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fangmin Lv reassigned ZOOKEEPER-3394:

    Assignee: Brian Nixon

> Delay observer reconnect when all learner masters have been tried
> -----------------------------------------------------------------
>                 Key: ZOOKEEPER-3394
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3394
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: quorum
>    Affects Versions: 3.6.0
>            Reporter: Brian Nixon
>            Assignee: Brian Nixon
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.6.0
>          Time Spent: 50m
>  Remaining Estimate: 0h
> Observers disconnect when the voting peers perform a leader election and reconnect
afterward. The delay zookeeper.observer.reconnectDelayMs was added to insulate the voting peers
from the returning observers. With a large number of peers and the observerMaster feature
active, this delay is mostly detrimental: the observer is more likely to get
hung up on connecting to a bad (down/corrupt) peer when it would be better off switching to
a new one quickly.
> To retain the protective virtue of the delay, it makes sense to add a delay that applies after
all observer masters in the list have been tried, before iterating through the list again.
In the case where observer masters are not active, this degenerates to a delay between connection
attempts on the leader.
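
The behaviour proposed above can be sketched roughly as follows. This is only an illustration of the idea, not the patch attached to the issue: the class name ObserverReconnectSketch, its methods, and the fullPassDelayMs parameter are hypothetical and do not correspond to ZooKeeper's actual code or configuration keys. The observer moves to the next learner master immediately after a failure and waits only once every entry in the list has been tried.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: NOT ZooKeeper's real implementation.
final class ObserverReconnectSketch {
    private final List<String> masters;   // learner master addresses (assumed non-empty)
    private final long fullPassDelayMs;   // assumed delay applied after a full pass
    private int index = 0;                // position of the master to try next

    ObserverReconnectSketch(List<String> masters, long fullPassDelayMs) {
        if (masters.isEmpty()) {
            throw new IllegalArgumentException("at least one learner master is required");
        }
        this.masters = new ArrayList<>(masters);
        this.fullPassDelayMs = fullPassDelayMs;
    }

    // The learner master the observer should try to connect to now.
    String current() {
        return masters.get(index);
    }

    // Record a failed attempt and return how long to wait before the next one.
    // The wait is zero while untried masters remain in this pass; once the
    // whole list has been tried, wrap around and return the full-pass delay.
    long onConnectFailure() {
        index++;
        if (index >= masters.size()) {
            index = 0;
            return fullPassDelayMs;
        }
        return 0L;
    }
}
```

With a single entry in the list (for example, only the leader), every failure completes a pass, so the sketch reduces to a fixed delay between connection attempts, which matches the degenerate case described in the issue.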

This message was sent by Atlassian JIRA
