lucene-solr-user mailing list archives

From <francois.groll...@barclays.com>
Subject RE: recovery process - node with stale data elected leader
Date Thu, 06 Nov 2014 10:32:08 GMT
Hi all,

Any idea on my issue below?

Thanks
Francois

-----Original Message-----
From: Grollier, Francois: IT (PRG) 
Sent: Tuesday, November 04, 2014 6:19 PM
To: solr-user@lucene.apache.org
Subject: recovery process - node with stale data elected leader

Hi,

I'm running SolrCloud 4.6.0 and I have a question/issue regarding the recovery process.

My cluster is made of 2 shards with 2 replicas each. Nodes A1 and B1 are leaders, A2 and B2
followers.

I start indexing docs and kill A2. I keep indexing for a while and then kill A1. At this point,
the cluster stops serving queries as one shard is completely unavailable.
Then I restart A2 first, then A1. A2 gets elected leader, waits a bit for more replicas to
be up and once it sees A1 it starts the recovery process.
My understanding of the recovery process was that at this point A2 would notice that A1 has
a more up to date state and it would sync with A1. It seems to happen like this but then I
get:

INFO  - 2014-11-04 11:50:43.068; org.apache.solr.cloud.RecoveryStrategy; Attempting to PeerSync from http://a1:8111/solr/executions/ core=executions - recoveringAfterStartup=false
INFO  - 2014-11-04 11:50:43.069; org.apache.solr.update.PeerSync; PeerSync: core=executions url=http://a2:8211/solr START replicas=[http://a1:8111/solr/executions/] nUpdates=100
INFO  - 2014-11-04 11:50:43.076; org.apache.solr.update.PeerSync; PeerSync: core=executions url=http://a2:8211/solr  Received 98 versions from a1:8111/solr/executions/
INFO  - 2014-11-04 11:50:43.076; org.apache.solr.update.PeerSync; PeerSync: core=executions url=http://a2:8211/solr  Our versions are newer. ourLowThreshold=1483859630192852992 otherHigh=1483859633446584320
INFO  - 2014-11-04 11:50:43.077; org.apache.solr.update.PeerSync; PeerSync: core=executions url=http://a2:8211/solr DONE. sync succeeded


And I end up with a different set of documents on each node (A1 has all the documents,
but A2 is missing some).
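
One way to confirm which documents A2 is missing is to query each core directly with
distrib=false, so the request is answered by that core alone, and then compare the returned
IDs. The following is only a rough sketch under assumptions: the unique key field is assumed
to be named "id", A2's core URL is assumed to be http://a2:8211/solr/executions (inferred from
the log above), and all helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed core URLs; A1's comes from the log, A2's is inferred from it.
A1_CORE = "http://a1:8111/solr/executions"
A2_CORE = "http://a2:8211/solr/executions"


def replica_query_url(core_url, rows=1000, id_field="id"):
    """Build a select URL that is answered by this core only (distrib=false)."""
    params = {
        "q": "*:*",
        "fl": id_field,      # assumption: the unique key field is called "id"
        "rows": rows,
        "wt": "json",
        "distrib": "false",  # do not fan the query out to other replicas
    }
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def ids_from_response(response, id_field="id"):
    """Extract the set of document IDs from a parsed JSON select response."""
    return {doc[id_field] for doc in response["response"]["docs"]}


def fetch_ids(core_url, rows=1000, id_field="id"):
    """Query one core over HTTP and return the IDs it holds (first `rows` only)."""
    with urlopen(replica_query_url(core_url, rows, id_field)) as resp:
        return ids_from_response(json.load(resp), id_field)


def missing_ids(reference_ids, candidate_ids):
    """Return the IDs present in reference_ids but absent from candidate_ids."""
    return sorted(set(reference_ids) - set(candidate_ids))
```

Calling missing_ids(fetch_ids(A1_CORE), fetch_ids(A2_CORE)) would then list the IDs present on
A1 but absent from A2. For an index larger than the rows limit, the query would need paging.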

Is my understanding wrong, and does it make no sense at all to start A2 before A1?

If my understanding is right, what could cause the desync? (I can provide more logs.) And is there
a way to force A2 to index the missing documents? I have tried the FORCERECOVERY command, but
it produces the same result as shown above.

Thanks
francois

_______________________________________________

This message is for information purposes only, it is not a recommendation, advice, offer or
solicitation to buy or sell a product or service nor an official confirmation of any transaction.
It is directed at persons who are professionals and is not intended for retail customer use.
Intended for recipient only. This message is subject to the terms at: www.barclays.com/emaildisclaimer.

For important disclosures, please see: www.barclays.com/salesandtradingdisclaimer regarding
market commentary from Barclays Sales and/or Trading, who are active market participants;
and in respect of Barclays Research, including disclosures relating to specific issuers, please
see http://publicresearch.barclays.com.

_______________________________________________
