lucene-dev mailing list archives

From "Timothy Potter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-6236) Need an optional fallback mechanism for selecting a leader when all replicas are in leader-initiated recovery.
Date Wed, 13 Aug 2014 20:14:13 GMT

    [ https://issues.apache.org/jira/browse/SOLR-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096040#comment-14096040 ]

Timothy Potter commented on SOLR-6236:
--------------------------------------

I wasn't able to get a test working that shows a replica failing to become the leader after it
loses its ZK session and its leader goes down around the same time (on trunk). Every time I ran
that scenario, the replica became the leader, which is a good thing. I've heard about cases in
the field where this happens, so I'm still trying to simulate it in a test environment.
Basically, I've tried expiring the ZK session on the replica and then killing the Jetty hosting
the leader, and the replica always becomes the leader as expected.
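
For reference, the usual way to simulate the session expiry is the duplicate-handle trick: open a
second ZooKeeper connection with the victim's session id and password, then close it, which expires
the shared session on the server side. A minimal, purely illustrative sketch (not the actual test
code; the class and method names here are made up):

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Illustrative helper only (not part of the patch or the Solr test framework):
// expires an existing ZooKeeper session by opening a second connection that
// reuses the same session id/password and then closing it.
public final class ZkSessionExpirer {

  public static void expireSession(String zkConnectString, ZooKeeper victim)
      throws Exception {
    Watcher noop = new Watcher() {
      @Override
      public void process(WatchedEvent event) {
        // no-op; we only need this handle long enough to close it
      }
    };
    // Reusing the victim's session id and password means closing this handle
    // expires the shared session on the server side.
    ZooKeeper duplicate = new ZooKeeper(zkConnectString, 30000, noop,
        victim.getSessionId(), victim.getSessionPasswd());
    duplicate.close();
  }
}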

Also, I'm reworking / rethinking this patch, as the previous approach works fine in a test
environment but won't work in general. The problem is that when a replica is trying to decide
whether it should force itself to be the leader, it doesn't really take the state of the other
replicas into account. It just assumes the others are in a bad state because it can't recover
itself. So in one case, a replica could decide not to force itself, thinking another replica will
do it, which might never happen. Conversely, it could decide to force itself when a better
candidate is simply being slow to become the leader. Mainly, I think these are areas that need
more investigation before this approach is vetted. I definitely like giving operators the ability
to "force_leader" by updating the leader-initiated recovery status for a replica, but I'm not so
sure about a replica doing that itself (without the intervention of a human operator).
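
On the operator side, the intervention basically amounts to clearing (or rewriting) the replica's
leader-initiated recovery znode so it is no longer blocked from leading. A rough sketch with the
raw ZooKeeper client is below; the znode path is my reading of how ZkController lays out LIR state
and should be double-checked, and the helper class itself is purely illustrative:

import org.apache.zookeeper.ZooKeeper;

// Illustrative only: an operator clears the LIR marker for one replica so it
// is no longer blocked from becoming leader. The znode path is an assumption
// based on how ZkController appears to lay out leader-initiated recovery state.
public final class ClearLirState {

  public static void clearLir(ZooKeeper zk, String collection, String shardId,
      String coreNodeName) throws Exception {
    String lirPath = "/collections/" + collection
        + "/leader_initiated_recovery/" + shardId + "/" + coreNodeName;
    if (zk.exists(lirPath, false) != null) {
      zk.delete(lirPath, -1); // -1 = ignore znode version
    }
  }
}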

> Need an optional fallback mechanism for selecting a leader when all replicas are in leader-initiated recovery.
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-6236
>                 URL: https://issues.apache.org/jira/browse/SOLR-6236
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>         Attachments: SOLR-6236.patch
>
>
> Offshoot from discussion in SOLR-6235, key points are:
> Tim: In ElectionContext, when running shouldIBeLeader, the node will choose not to be the leader if it is in LIR. However, this could lead to there being no leader at all. My thinking there is that the state is bad enough that we would need manual intervention to clear one of the LIR znodes to allow a replica to get past this point. But maybe we can do better here?
> Shalin: Good question. With careful use of minRf, the user can retry operations and maintain consistency even if we arbitrarily elect a leader in this case. But most people won't use minRf and don't care about consistency as much as availability. For them there should be a way to get out of this mess easily. We can have a collection property (boolean + timeout value) to force elect a leader even if all shards were in LIR. What do you think?
> Mark: Indeed, it's a current limitation that you can have all nodes in a shard thinking they cannot be leader, even when all of them are available. This is not required by the distributed model we have at all, it's just a consequence of being over-restrictive in the initial implementation - if all known replicas are participating, you should be able to get a leader. So I'm not sure if this case should be optional. But iff not all known replicas are participating and you still want to force a leader, that should be optional - I think it should default to false though. I think the system should default to reasonable data safety in these cases.
> How best to solve this, I'm not quite sure, but happy to look at a patch. How do you plan on monitoring and taking action? Via the Overseer? It seems tricky to do it from the replicas.
> Tim: We have a similar issue where a replica attempting to be the leader needs to wait a while to see other replicas before declaring itself the leader, see ElectionContext around line 200:
> int leaderVoteWait = cc.getZkController().getLeaderVoteWait();
> if (!weAreReplacement) {
>   waitForReplicasToComeUp(weAreReplacement, leaderVoteWait);
> }
> So one quick idea might be to have the code that checks if it's in LIR see if all replicas are in LIR and if so, wait out the leaderVoteWait period and check again. If all are still in LIR, then move on with becoming the leader (in the spirit of availability).
> {quote}
> But iff not all known replicas are participating and you still want to force a leader, that should be optional - I think it should default to false though. I think the system should default to reasonable data safety in these cases.
> {quote}
> Shalin: That's the same case as the leaderVoteWait situation and we do go ahead after that amount of time even if all replicas aren't participating. Therefore, I think that we should handle it the same way. But to help people who care about consistency over availability, there should be a configurable property which bans this auto-promotion completely.
> In any case, we should switch to coreNodeName instead of coreName and open an issue to improve the leader election part.
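
To make the wait-and-retry idea from the quoted discussion concrete, here is a rough sketch of a
shouldIBeLeader-style check that, when every replica is in LIR, waits out the leaderVoteWait
period and re-checks before taking leadership. The LirView interface and its methods are made-up
placeholders (not Solr APIs), and this is not the code in the attached patch:

import java.util.concurrent.TimeUnit;

// Illustrative sketch only of the wait-and-retry idea discussed above. LirView
// is a made-up stand-in for "can I see the LIR state of my shard's replicas";
// it is not a Solr API, and this is not the attached patch.
public final class ForceLeaderSketch {

  public interface LirView {
    boolean amIInLir();            // is this replica marked for leader-initiated recovery?
    boolean allReplicasInLir();    // is every known replica of the shard marked as well?
    boolean autoPromotionBanned(); // collection property disabling auto-promotion entirely
  }

  public static boolean shouldIBeLeader(LirView lir, long leaderVoteWaitMs)
      throws InterruptedException {
    if (!lir.amIInLir()) {
      return true; // normal case: nothing blocks this replica from leading
    }
    if (!lir.allReplicasInLir()) {
      return false; // some other replica is not in LIR; defer to it
    }
    // Everyone is in LIR. Wait out the leaderVoteWait period (mirroring
    // waitForReplicasToComeUp) and check again before deciding.
    TimeUnit.MILLISECONDS.sleep(leaderVoteWaitMs);
    if (lir.allReplicasInLir() && !lir.autoPromotionBanned()) {
      return true; // still stuck: favor availability and take leadership
    }
    return false;
  }
}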
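
And to illustrate Shalin's minRf point above: a client that sets min_rf on its updates can detect
when an update did not reach enough replicas and retry it later, which is what maintains
consistency even if a leader is elected arbitrarily. A rough SolrJ sketch follows; where the
achieved replication factor appears in the response ("rf" in the response header below) is an
assumption on my part and should be verified against the min_rf implementation:

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.util.NamedList;

// Rough sketch of a min_rf-aware client. The location of the achieved "rf"
// value in the response is an assumption, not a verified part of the API.
public final class MinRfUpdateSketch {

  public static boolean indexWithMinRf(CloudSolrServer server,
      SolrInputDocument doc, int minRf) throws Exception {
    UpdateRequest req = new UpdateRequest();
    req.setParam("min_rf", String.valueOf(minRf)); // ask Solr to track and report the achieved rf
    req.add(doc);
    UpdateResponse rsp = req.process(server);

    NamedList<?> header = rsp.getResponseHeader();
    Object achieved = header == null ? null : header.get("rf"); // assumed location of achieved rf
    if (achieved instanceof Number && ((Number) achieved).intValue() >= minRf) {
      return true; // enough replicas acknowledged the update
    }
    // Not enough replicas saw the update: the caller should retry later
    // (or queue the doc) rather than assume it is durable everywhere.
    return false;
  }
}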



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

