From: "Nazik Huq"
To: solr-user@lucene.apache.org
Subject: SolrCloud recovery after nodes are rebooted in rapid succession
Date: Thu, 6 Mar 2014 18:15:43 -0500
charset="us-ascii" Content-Transfer-Encoding: 7bit Hello, I have a question from a colleague who's managing a 3-node(VMs) SolrCloud cluster with a separate 3-node Zookeeper ensemble. Periodically the data center underneath the SolrCloud decides to upgrade the SolrCloud instance infrastructure in a "rolling upgrade" fashion. So after the 1st instance of the SolrCloud is shut down and while it is in the process of rebooting, the 2nd instance starts to shut down and so on. Eventually all three Solr instances are rebooted and up and running but the cluster in now inoperable. Meaning clients can't query or ingest data. My colleague is trying to ascertain if this problem is due to Solr's inability to recover from a rapid succession of reboots of the nodes or from the data center upgrade that is triggering a "situation" making SolrCloud inoperable. My question is, can a SolrCloud cluster become inoperable after its nodes are rebooted in rapid succession as described above? Is there an edge case similar to this? Thanks, Nazik Huq ------=_NextPart_000_007F_01CF3968.178FB630--