lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: 7.2.1 cluster dies within minutes after restart
Date Sat, 27 Jan 2018 09:03:05 GMT
Hello,

I grepped for it yesterday and found nothing but 30000 in the settings, but judging from the
odd timeout value in the log, you may be right. Let me apply your patch early next week and
check for spurious warnings.

Another noteworthy observation for those working on cloud stability and recovery: whenever
this happens, some nodes are also certain to run out of memory (OOM). The leaders usually
live longest; the replicas don't, and their heap usage peaks every time, consistently.

Thanks,
Markus
 
-----Original message-----
> From:Shawn Heisey <apache@elyograg.org>
> Sent: Saturday 27th January 2018 0:49
> To: solr-user@lucene.apache.org
> Subject: Re: 7.2.1 cluster dies within minutes after restart
> 
> On 1/26/2018 10:02 AM, Markus Jelsma wrote:
> > o.a.z.ClientCnxn Client session timed out, have not heard from server in 22130ms (although zkClientTimeOut is 30000).
> 
> Are you absolutely certain that there is a setting for zkClientTimeout
> that is actually getting applied?  The default value in Solr's example
> configs is 30 seconds, but the internal default in the code (when no
> configuration is found) is still 15.  I have confirmed this in the code.
> 
> Looks like SolrCloud doesn't log the values it's using for things like
> zkClientTimeout.  I think it should.
> 
> https://issues.apache.org/jira/browse/SOLR-11915
> 
> Thanks,
> Shawn
> 
> 
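
Until Solr logs the value itself, one way to double-check which zkClientTimeout a node would end up with is to resolve the setting by hand. The following is only a rough sketch, not Solr's own logic: it assumes solr.xml carries the setting as an <int name="zkClientTimeout"> element under <solrcloud> with a ${property:default} placeholder (as in the example configs), and it takes the 15-second code default from Shawn's message rather than from the source. The function name and the sysprops dictionary are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Fallback when solr.xml has no zkClientTimeout entry at all.
# 15 seconds is the figure quoted in the message above; it is an
# assumption here, not something this script verifies against Solr.
CODE_DEFAULT_MS = 15000

# Matches property placeholders of the form ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def effective_zk_client_timeout(solr_xml: str, sysprops: dict) -> int:
    """Return the timeout (ms) that the given solr.xml text would yield.

    sysprops stands in for the -D system properties passed to the JVM.
    """
    root = ET.fromstring(solr_xml)
    node = root.find("./solrcloud/int[@name='zkClientTimeout']")
    if node is None or not (node.text or "").strip():
        # Setting is absent: only the built-in default applies.
        return CODE_DEFAULT_MS

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in sysprops:
            return str(sysprops[name])
        if default is not None:
            return default
        raise KeyError(f"no value and no default for property {name!r}")

    return int(_PLACEHOLDER.sub(substitute, node.text.strip()))


if __name__ == "__main__":
    example = (
        "<solr><solrcloud>"
        '<int name="zkClientTimeout">${zkClientTimeout:30000}</int>'
        "</solrcloud></solr>"
    )
    # No -DzkClientTimeout given: the placeholder default is used.
    print(effective_zk_client_timeout(example, {}))
    # solr.xml without the element: falls back to the code default.
    print(effective_zk_client_timeout("<solr><solrcloud/></solr>", {}))
```

If the value printed for the real solr.xml and startup properties is 30000, the configuration is at least not the obvious culprit. If the element is missing, the smaller built-in default would apply, which is the case Shawn describes.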
