lucene-solr-user mailing list archives

From Timothy Potter <thelabd...@gmail.com>
Subject Rogue query killed several replicas with OOM, after recovering - match all docs query problem
Date Fri, 19 Apr 2013 22:37:26 GMT
We had a rogue query take out several replicas in a large 4.2.0 cluster
today, due to OOMs (we use the JVM args to kill the process on OOM).

After recovering, when I execute the match all docs query (*:*), I get a
different count each time.

In other words, if I execute q=*:* several times in a row, then I get a
different count back for numDocs.
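
One way to narrow this down is to ask each replica of the same shard for its
own count with distrib=false, so that the query is not fanned out across the
cluster. Below is a minimal sketch of that check, not something we run today.
The core URLs and the helper names (count_url, fetch_count, compare_counts)
are hypothetical, and it assumes the JSON response writer, where the hit count
comes back as numFound.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(core_url, distrib=False):
    """Build a match-all-docs query URL that asks only for the hit count."""
    params = {
        "q": "*:*",
        "rows": 0,        # no documents needed, only the count
        "wt": "json",
        "distrib": "true" if distrib else "false",
    }
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def fetch_count(url):
    """Run the query and return numFound from the JSON response."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)["response"]["numFound"]


def compare_counts(core_urls, fetch=fetch_count):
    """Return per-core counts and whether all of them agree.

    core_urls should be replicas of the SAME shard; different shards
    are expected to hold different numbers of documents.
    """
    counts = {u: fetch(count_url(u)) for u in core_urls}
    return counts, len(set(counts.values())) <= 1


if __name__ == "__main__":
    # Hypothetical hosts and core names; replace with real replica URLs.
    replicas = [
        "http://host1:8983/solr/collection1_shard1_replica1",
        "http://host2:8983/solr/collection1_shard1_replica2",
    ]
    counts, consistent = compare_counts(replicas)
    for url, n in counts.items():
        print(url, n)
    print("consistent:", consistent)
```

If the replicas of one shard disagree, the changing totals would be explained
by the distributed query hitting a different replica on each request.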

This was not the case prior to the failure, as that is one thing we monitor
for.

I think I should be worried ... any ideas on how to troubleshoot this? One
thing to mention is that several of my replicas had to do full recoveries
from the leader when they came back online. Indexing was happening when the
replicas failed.

Thanks.
Tim
