lucene-solr-user mailing list archives

From "Kelly, Frank" <frank.ke...@here.com>
Subject SolrCloud behavior when a ZooKeeper node goes down
Date Mon, 08 Feb 2016 20:09:24 GMT
We are running a small SolrCloud cluster on AWS.

Solr: Version 5.3.1
ZooKeeper: Version 3.4.6

3 x ZooKeeper nodes (with higher limits and timeouts due to being on AWS)
3 x Solr nodes (8 GB of memory each; 2 collections with 3 shards each)

Let's call the ZooKeeper nodes A, B and C.
One of our ZooKeeper nodes (B) failed a health check and was replaced by autoscaling,
but during the failover our SolrCloud cluster became unavailable. New connections to Solr
failed with connectivity errors, and pre-existing connections also saw errors.

These errors occurred for both queries and adds:

org.apache.solr.common.SolrException: Could not load collection from ZK:qa_us-east-1_here_account
    at org.apache.solr.client.solrj.impl.CloudSolrClient.getDocCollection(CloudSolrClient.java:1205)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:837)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:805)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:943)
    at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:958)
    at com.here.scbe.search.solr.SolrFacadeImpl.querySearchIndex(SolrFacadeImpl.java:183)
    at com.ovi.scbe.search.search.impl.SolrSearcher.searchInner(SolrSearcher.java:69)
    at com.ovi.scbe.search.search.impl.SolrSearcher.search(SolrSearcher.java:56)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:345)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:342)
    at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:342)



org.apache.solr.common.SolrException: Could not load collection from ZK:qa_us-east-1_public_index
    at org.apache.solr.client.solrj.impl.CloudSolrClient.getDocCollection(CloudSolrClient.java:1205)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:837)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:805)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:107)
    at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:72)
    at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:86)
    at com.here.scbe.search.solr.SolrFacadeImpl.addToSearchIndex(SolrFacadeImpl.java:108)
    at com.ovi.scbe.search.index.impl.SolrIndexer.index(SolrIndexer.java:72)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:345)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:342)
    at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:342)

I thought that because we had configured SolrCloud to point at all three ZK nodes, the failure
of one ZK node would be OK (since we still had a quorum).
Did I misunderstand something about SolrCloud and its relationship with ZK?
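
The quorum expectation above can be written out as a small calculation. This is only a minimal sketch of the majority rule a ZooKeeper ensemble uses; the class and method names (QuorumMath, majority, tolerableFailures, hasQuorum) are hypothetical and are not part of Solr or ZooKeeper.

```java
public class QuorumMath {
    /** Smallest number of nodes that forms a strict majority of the ensemble. */
    public static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of nodes that can be lost while a majority is still available. */
    public static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    /** True if the live nodes still form a majority of the configured ensemble. */
    public static boolean hasQuorum(int ensembleSize, int liveNodes) {
        return liveNodes >= majority(ensembleSize);
    }

    public static void main(String[] args) {
        int ensembleSize = 3; // nodes A, B and C
        int liveNodes = 2;    // B is down, A and C remain
        System.out.println("majority needed:    " + majority(ensembleSize));
        System.out.println("tolerable failures: " + tolerableFailures(ensembleSize));
        System.out.println("quorum with A and C: " + hasQuorum(ensembleSize, liveNodes));
    }
}
```

By this arithmetic a three-node ensemble should survive the loss of one node, which is why the outage was unexpected. The calculation only holds if the two remaining nodes can still reach each other and the clients can still reach them.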

The weird thing is that once the new ZooKeeper node (D) started up, after a few minutes
we could connect to SolrCloud again, even though we were still only pointing to A, B and C
(not D).
Any thoughts on why this happened?
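
One way to see which of the configured addresses actually reach a live ZooKeeper server is to send each one the "ruok" four-letter command, which a running ZooKeeper 3.4.x server answers with "imok". The following is a minimal diagnostic sketch, not something taken from our setup; the class name ZkProbe and the host:port arguments are hypothetical. Note that "imok" only shows the server process is up, not that it has joined the quorum.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ZkProbe {
    /** Sends "ruok" to host:port and returns true only if the reply is "imok". */
    public static boolean isOk(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);

            OutputStream out = socket.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Read up to four bytes of the reply; the server closes the socket afterwards.
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[4];
            int read = 0;
            while (read < buf.length) {
                int n = in.read(buf, read, buf.length - read);
                if (n < 0) {
                    break;
                }
                read += n;
            }
            return "imok".equals(new String(buf, 0, read, StandardCharsets.US_ASCII));
        } catch (IOException e) {
            // Unresolvable host, refused connection, or timeout all count as "not ok".
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java ZkProbe host:port [host:port ...]");
            return;
        }
        for (String address : args) {
            int colon = address.lastIndexOf(':');
            String host = address.substring(0, colon);
            int port = Integer.parseInt(address.substring(colon + 1));
            System.out.println(address + " -> " + (isOk(host, port, 3000) ? "imok" : "no response"));
        }
    }
}
```

Running this against the addresses configured for A, B and C, before and after a node is replaced, would show whether the address for B started answering again once D was up.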

Best,

-Frank

Frank Kelly
Principal Software Engineer
Predictive Analytics Team (SCBE/HAC/CDA)
HERE
5 Wayside Rd, Burlington, MA 01803, USA
42° 29' 7" N 71° 11' 32" W
