lucene-solr-user mailing list archives

From Shawn Heisey <elyog...@elyograg.org>
Subject Re: Replicas do not come up after nodes are restarted in SOLR cloud
Date Wed, 05 Sep 2018 13:52:33 GMT
On 9/5/2018 2:55 AM, Sudip Mukherjee wrote:
> I have a 2 node SOLR (7.x) cloud cluster on which I have collection with
> replicas ( replicationFactor = 2, shard = 1 ). I am seeing that the replicas
> do not come up ( state is "down") when both nodes are restarted. From the
> "legend" in Graph section, I see that the replicas are in "recovery failed"
> state.
<snip>
> Caused by: java.net.SocketTimeoutException: Read timed out
>                  at java.net.SocketInputStream.socketRead0(Native Method)
>                  at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)

Have you changed the socket timeout in Solr's config?

The socket timeout for internode requests defaults to 60 seconds.  If 
something happened that prevented a Solr server from responding within 
60 seconds, then there's something *REALLY* wrong.

My best guess is that your Solr heap is too small, causing Java to spend 
almost all of its time doing garbage collection.  Another possibility is 
that a too-small heap has caused one of your servers to experience an 
OutOfMemoryError, which on non-Windows systems will result in the Solr 
process being killed.
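
One way to check for the OutOfMemoryError case is to look in the Solr logs directory for a file left behind when the process was killed. The sketch below is a minimal example and rests on an assumption: that the kill script writes a file whose name starts with "solr_oom_killer-" and ends with ".log". Verify that pattern against your own installation. The function name and the example path are hypothetical.

```python
from pathlib import Path
from typing import List


def oom_killer_logs(logs_dir: str) -> List[str]:
    """List files in logs_dir that match the assumed OOM killer log pattern.

    The "solr_oom_killer-*.log" pattern is an assumption about how the
    kill script names its output; adjust it if your install differs.
    """
    return sorted(p.name for p in Path(logs_dir).glob("solr_oom_killer-*.log"))


if __name__ == "__main__":
    # Hypothetical logs location; substitute the directory your Solr uses.
    for name in oom_killer_logs("/var/solr/logs"):
        print(name)
```

An empty result does not rule out a heap problem. It only means no file with that name was found in that directory.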

Some questions in case that's not it:

How many collections do you have on this setup?

In the admin UI (Cloud tab), what hostname do your nodes show they are 
registered as?  If it's localhost, that's going to be a problem for a 
2-node cluster.
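
The same hostname check can be done without the admin UI by asking the Collections API for CLUSTERSTATUS and reading the live_nodes list. This is a minimal sketch, not a definitive tool. The function names, the example base URL, and the sample data are hypothetical, and it assumes node names look like "host:port_solr".

```python
import json
import urllib.request
from typing import Any, Dict, List

# Prefixes that indicate a node registered itself under a loopback address.
LOOPBACK_PREFIXES = ("localhost", "127.")


def fetch_cluster_status(base_url: str) -> Dict[str, Any]:
    """Fetch CLUSTERSTATUS as JSON; base_url is e.g. http://host:8983/solr."""
    url = base_url.rstrip("/") + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def loopback_nodes(cluster_status: Dict[str, Any]) -> List[str]:
    """Return live node names that start with a loopback hostname or address."""
    live_nodes = cluster_status.get("cluster", {}).get("live_nodes", [])
    return [n for n in live_nodes if n.lower().startswith(LOOPBACK_PREFIXES)]


if __name__ == "__main__":
    # Hypothetical sample shaped like the relevant part of a CLUSTERSTATUS reply.
    sample = {
        "cluster": {
            "live_nodes": ["localhost:8983_solr", "solr2.example.com:8983_solr"]
        }
    }
    print(loopback_nodes(sample))
```

Against a running cluster, pass the result of fetch_cluster_status() to loopback_nodes(). Any name it returns is a node that other machines cannot reach under that address.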

Thanks,
Shawn

