lucene-solr-user mailing list archives

From Stephen Weiss <>
Subject Re: Help with Solr 1.3 lockups?
Date Thu, 15 Jan 2009 22:13:09 GMT
I've been wondering about this one myself. Most of the services we
have installed restart automatically if they crash out for whatever
reason (Apache, MySQL, even the OS itself). Failures are detected and
corrected by the load balancers and, in some cases, by the machine
itself (as with kernel panics). But not Solr, and I'm not quite sure
what to do to get it there. We use Jetty, but it's the same story.
It doesn't fail all that often, but when it does it will still respond
to HTTP requests (because Jetty itself is still working), which makes
a failure a lot harder to detect. I've tried writing something for
Nagios, but the problem is that most responses Solr would give to a
request vary depending on index updates, so I can't just take a
checksum and compare it. Even then, it would only alert us to the
problem; we'd still have to go in and restart everything (personally
I don't enjoy restarting servers from my BlackBerry nearly as much as
I should).

I'd have to come up with something that can intelligently interpret  
the response and decide if the server's still working properly or not,  
and the processing time on that alone might make it too inefficient to  
run every few seconds, but at least with that we'd be able to tell the  
cluster "don't send anything to this server for now".  Is there some  
really obvious way to track if a particular servlet is still running  
properly (in either Tomcat or Jetty, because if Tomcat has this I'd  
switch) and restart the container if it's not?
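
A rough sketch of the kind of check described above, assuming Solr's
JSON response writer (wt=json) is available and that a healthy reply
carries responseHeader.status == 0. The URL, port, and timeout are
placeholders for illustration, not values from this thread. Because it
reads only the status field, index updates do not change the result:

```python
import http.client
import json
import sys
import urllib.request

# Assumed URL: adjust host, port, and path for your own install.
QUERY_URL = "http://localhost:8983/solr/select?q=*:*&rows=0&wt=json"


def is_healthy(body):
    """Return True if body parses as JSON with responseHeader.status == 0."""
    try:
        doc = json.loads(body)
        return doc["responseHeader"]["status"] == 0
    except (ValueError, KeyError, TypeError):
        # Not JSON, or not shaped like a Solr response.
        return False


def check(url=QUERY_URL, timeout=5.0):
    """Fetch url; timeouts, connection errors, and HTTP errors count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8", errors="replace"))
    except (OSError, http.client.HTTPException):
        return False


if __name__ == "__main__":
    # Nagios-style exit codes: 0 = OK, 2 = CRITICAL.
    sys.exit(0 if check() else 2)
```

The timeout matters as much as the parsing: a wedged Solr behind a
working container may accept the connection and never answer, and the
timeout turns that into a failed check. Restarting the container on a
non-zero exit is left to whatever wrapper runs the script, since the
restart command depends on the install.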



On Jan 15, 2009, at 1:57 PM, Jerome L Quinn wrote:
> An even bigger problem is the fact that once Solr is wedged, it
> stays that way until a human notices and restarts things. The Tomcat
> stays running and there's no automatic detection that will either
> restart Solr, or restart the Tomcat container.
> Any suggestions on either front?
> Thanks,
> Jerry Quinn
