lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3939) Solr Cloud recovery and leader election when unloading leader core
Date Wed, 17 Oct 2012 18:00:04 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478081#comment-13478081 ]

Yonik Seeley commented on SOLR-3939:
------------------------------------

Mark & I were chatting a little about this.  The easiest fix would seem to be:
- If a single remaining replica is active, it should always become the leader (even if it
  has no recent versions).
- If an existing replica comes back up and tries to sync with this new leader, the sync should
  fail (or the replica should somehow be forced to replicate from the new leader).
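
To make the two rules concrete, here is a rough sketch in Python. It is not Solr code: the
Replica class, should_become_leader and recovery_action are hypothetical names invented for
illustration, and the fallback used when more than one replica is active (require recent
versions) is only an assumption about the existing behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    # Hypothetical model of a replica; not a Solr class.
    name: str
    active: bool
    # Recent update versions; empty right after a full replication.
    recent_versions: list = field(default_factory=list)


def should_become_leader(candidate, replicas):
    """Rule 1: a sole active replica always takes leadership."""
    active = [r for r in replicas if r.active]
    if len(active) == 1 and active[0] is candidate:
        # Even with no recent versions, there is nobody else to lead.
        return True
    # Assumption: otherwise a candidate needs recent versions to lead.
    return candidate.active and bool(candidate.recent_versions)


def recovery_action(leader):
    """Rule 2: a returning replica cannot sync against a leader that has
    no recent versions, so it is forced to do a full replication."""
    return "sync" if leader.recent_versions else "replicate"


if __name__ == "__main__":
    survivor = Replica("core2", active=True)            # just finished replicating
    returning = Replica("core1", active=False, recent_versions=[101, 102])
    print(should_become_leader(survivor, [survivor, returning]))
    print(recovery_action(survivor))
```

With these rules the surviving replica wins the election on its own, and the replica that comes
back later is told to replicate rather than sync.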

That still raises the question... what if two replicas both have no recent versions because
they both just finished replicating?

Another solution (which is more difficult and would take longer) is to store some number of
the latest versions in the commit data.
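
As a rough illustration of that idea, the sketch below treats commit data as a plain
string-to-string map and stores the last few versions under one key. The key name
"recentVersions", the limit of 100 and both function names are assumptions made up for this
example, not anything that exists in Solr or Lucene.

```python
MAX_VERSIONS = 100  # hypothetical limit on how many versions to keep


def build_commit_data(recent_versions, existing=None, limit=MAX_VERSIONS):
    """Return a copy of the commit data with the newest `limit` versions added.

    `recent_versions` is assumed to be ordered oldest to newest.
    """
    data = dict(existing or {})
    latest = list(recent_versions)[-limit:]
    # Commit data holds strings only, so encode the versions as a comma-separated list.
    data["recentVersions"] = ",".join(str(v) for v in latest)
    return data


def read_recent_versions(commit_data):
    """Recover the stored versions; returns an empty list if none were stored."""
    raw = commit_data.get("recentVersions", "")
    return [int(v) for v in raw.split(",") if v]


if __name__ == "__main__":
    data = build_commit_data([11, 12, 13, 14], limit=2)
    print(data)
    print(read_recent_versions(data))
```

A replica that has just replicated the index could then read the versions back from the commit
point instead of starting with none.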
                
> Solr Cloud recovery and leader election when unloading leader core
> ------------------------------------------------------------------
>
>                 Key: SOLR-3939
>                 URL: https://issues.apache.org/jira/browse/SOLR-3939
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.0-BETA, 4.0
>            Reporter: Joel Bernstein
>            Assignee: Mark Miller
>            Priority: Critical
>              Labels: 4.0.1_Candidate
>             Fix For: 4.1, 5.0
>
>         Attachments: cloud.log, SOLR-3939.patch
>
>
> When a leader core is unloaded using the core admin api, the followers in the shard go
> into recovery but do not come out. Leader election doesn't take place and the shard goes down.
> This affects the ability to move a micro-shard from one Solr instance to another Solr
> instance.
> The problem does not occur 100% of the time but a large % of the time.
> To set up a test, start up Solr Cloud with a single shard. Add cores to that shard as replicas
> using core admin. Then unload the leader core using core admin.
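
The reproduction steps quoted above can be written down as CoreAdmin requests. The sketch below
only builds the URLs and does not send them. The host, port, core names, collection name and
shard name are placeholders assumed for illustration; CREATE and UNLOAD are the CoreAdmin
actions involved.

```python
from urllib.parse import urlencode

# Placeholder base URL; adjust host and port for each Solr instance.
BASE = "http://localhost:8983/solr/admin/cores"


def core_admin_url(base=BASE, **params):
    """Build a CoreAdmin request URL from keyword parameters."""
    return base + "?" + urlencode(params)


def add_replica_url(name, collection="collection1", shard="shard1"):
    # Creating a core with the same collection and shard adds it as a replica.
    return core_admin_url(action="CREATE", name=name,
                          collection=collection, shard=shard)


def unload_core_url(core):
    # Unloading the core that currently leads the shard triggers the bug.
    return core_admin_url(action="UNLOAD", core=core)


if __name__ == "__main__":
    print(add_replica_url("replica1"))
    print(unload_core_url("leader_core"))
```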

