sling-dev mailing list archives

From "Stefan Egli (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SLING-7830) Defined leader switch
Date Mon, 20 Aug 2018 10:15:00 GMT

    [ https://issues.apache.org/jira/browse/SLING-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16585747#comment-16585747 ]

Stefan Egli edited comment on SLING-7830 at 8/20/18 10:14 AM:
--------------------------------------------------------------

(btw: with 'duplicates' I was referring to a temporary situation that can potentially happen
while the topology change is going on - leading to a situation where a non-leader instance
is told it's the leader while the old leader hasn't been downgraded to non-leader yet. It
will resolve itself as soon as all topology events are sent/consumed.)
bq.  From the discussion so far I understand that "0_" will become the new leader as the leaderElectionId
is lowest.
right
bq.  And this would cause duplicate events for topology changes?
No, as long as it has the leaderElectionId already set to "0_" at the time it joins, there's
no glitch. The glitch only happens if the leaderElectionId is changed "at runtime" while already
in the cluster (and the id change actually results in a leader change)
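
A minimal sketch of the join case described above, assuming the leader is simply the instance
whose leaderElectionId sorts lowest as a string. The id layout (prefix, startup time, sling id),
the instance names and the elect_leader helper are hypothetical and only serve as illustration;
this is not the Discovery implementation.

```python
# Hypothetical model: every instance advertises a leaderElectionId string and
# the instance with the lexicographically lowest id is considered the leader.

def elect_leader(election_ids):
    """Return the name of the instance whose leaderElectionId sorts lowest."""
    return min(election_ids, key=lambda name: election_ids[name])

# Old instances carry the default "1_" prefix (the id layout is illustrative).
cluster = {
    "old-a": "1_0000001000_aaaa",
    "old-b": "1_0000002000_bbbb",
}
leader_before = elect_leader(cluster)

# A new instance joins with "0_" already set, so the topology change that
# announces the join is also the one that moves the leader.
cluster["new-c"] = "0_0000003000_cccc"
leader_after = elect_leader(cluster)

print(leader_before, "->", leader_after)
```

In this model no id is rewritten while the instance is part of the cluster, which is the
condition named above for avoiding the glitch.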

So the question is just: when exactly do the new instances get their new, correct leaderElectionId
set?

In my first approach I was using the default behaviour, i.e. the new instances get a "1_" prefix,
but before they join, the prefixes of the existing ones are incremented so that they "step back
from wanting to be leader".
In your approach I'm not clear when the leaderElectionId would be changed, but I'm guessing
it would happen at runtime. If we can set it before they join, we're fine (but how, that would
be my question then).
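
The first approach can be sketched as follows, again assuming lowest-string-wins ordering.
The bump_prefix helper, the id layout and the instance names are hypothetical; the sketch also
assumes single-digit prefixes, since plain string comparison would sort "10_" before "2_".

```python
# Hypothetical model of "existing instances step back before the new ones join".

def elect_leader(election_ids):
    """Return the name of the instance whose leaderElectionId sorts lowest."""
    return min(election_ids, key=lambda name: election_ids[name])

def bump_prefix(election_id):
    """Increment the numeric prefix in front of the first underscore."""
    prefix, rest = election_id.split("_", 1)
    return f"{int(prefix) + 1}_{rest}"

existing = {
    "old-a": "1_0000001000_aaaa",
    "old-b": "1_0000002000_bbbb",
}
leader_initial = elect_leader(existing)

# Step 1: all existing instances bump their prefix. Their relative order is
# unchanged, so this runtime id change does not result in a leader change.
stepped_back = {name: bump_prefix(eid) for name, eid in existing.items()}
leader_after_bump = elect_leader(stepped_back)

# Step 2: the new instance joins with the default "1_" prefix and sorts lowest.
stepped_back["new-c"] = "1_0000003000_cccc"
leader_final = elect_leader(stepped_back)

print(leader_initial, leader_after_bump, leader_final)
```

The point of the ordering is that the only id changes made at runtime are ones that keep the
current leader in place; the leader moves only through the join itself.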


was (Author: egli):
(btw: with 'duplicates' I was referring to a temporary situation that can potentially happen
while the topology change is going on - leading to a situation where two instances are told
they're leader. It will resolve itself as soon as all topology events are sent/consumed.)
bq.  From the discussion so far I understand that "0_" will become the new leader as the leaderElectionId
is lowest.
right
bq.  And this would cause duplicate events for topology changes?
No, as long as it has the leaderElectionId already set to "0_" at the time it joins, there's
no glitch. The glitch only happens if the leaderElectionId is changed "at runtime" while already
in the cluster (and the id change actually results in a leader change)

So the question is just: when exactly do the new instances get their new, correct leaderElectionId
set?

In my first approach I was using the default behaviour, i.e. the new instances get a "1_" prefix,
but before they join, the prefixes of the existing ones are incremented so that they "step back
from wanting to be leader".
In your approach I'm not clear when the leaderElectionId would be changed, but I'm guessing
it would happen at runtime. If we can set it before they join, we're fine (but how, that would
be my question then).

> Defined leader switch
> ---------------------
>
>                 Key: SLING-7830
>                 URL: https://issues.apache.org/jira/browse/SLING-7830
>             Project: Sling
>          Issue Type: Improvement
>          Components: Discovery
>            Reporter: Carsten Ziegeler
>            Priority: Major
>
> The current leader selection is based (mainly) on startup time and sling id and is stable
> across changes in the topology for as long as the leader is up and running.
> However, there are use cases like blue-green deployment where new instances with a new
> version are started and take over the functionality. With the current discovery
> setup, the leader would still be one of the instances with the old version.
> With a newly deployed version, tasks currently bound to the leader should run on the new
> version.
> Therefore the leader needs to switch and stay the leader (until it dies).
> We probably need an additional criterion for the leader selection.
> /cc [~egli]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)