lucene-solr-user mailing list archives

From: Jan Høydahl <jan....@cominvent.com>
Subject: Re: CloudSolrClient (any version). Find the node your query has connected to.
Date: Thu, 23 May 2019 06:54:31 GMT
Try to add &shards.info=true to your request. It will return a section telling exactly
what shards/replicas served that request with counts and all :)
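The same parameter can be set from SolrJ. A minimal sketch, assuming a hypothetical collection "mycollection" and a ZooKeeper ensemble at localhost:2181 (the Builder signature shown is the one from the 7.x/8.x line):

    import java.util.Collections;
    import java.util.Optional;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.util.NamedList;

    public class ShardsInfoExample {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
                client.setDefaultCollection("mycollection");  // hypothetical collection name

                SolrQuery query = new SolrQuery("*:*");
                query.set("shards.info", true);  // ask Solr to report which shards/replicas served the query

                QueryResponse rsp = client.query(query);
                // the per-shard breakdown (replica URL, hit count, time) comes back under "shards.info"
                NamedList<?> shardsInfo = (NamedList<?>) rsp.getResponse().get("shards.info");
                System.out.println(shardsInfo);
            }
        }
    }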

Jan Høydahl

> On 22 May 2019, at 21:17, Erick Erickson <erickerickson@gmail.com> wrote:
> 
> You have to be a little careful here: one thing I learned relatively recently is that there are in-memory structures that hold pointers to _all_ un-searchable docs (i.e. docs for which no new searcher has been opened since they were added/updated) to support real-time get. So if you’re indexing a _lot_ of docs, that internal structure can grow quite large….
> 
> FWIW, delete-by-query is painful. Each one has to lock all indexing on all replicas while it completes. If you can use delete-by-id it’d be better.
> 
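For reference, the two SolrJ calls look roughly like this (a sketch only; the collection name, query, and ids are made up, and client is the CloudSolrClient from the example above):

    // delete-by-query: blocks indexing on every replica until it completes
    client.deleteByQuery("mycollection", "category:obsolete");

    // delete-by-id: behaves like an ordinary single-document update, so it is much cheaper
    client.deleteById("mycollection", java.util.Arrays.asList("doc1", "doc2"));

    client.commit("mycollection");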
> Let’s back up a bit and look at _why_ your nodes go into recovery…. Leave the replicas on if you can and look for “Leader Initiated Recovery” (not sure that’s the exact phrase, but you’ll see something very like that). If that’s the case, then one situation we’ve seen is that a request takes too long to return from a follower. So the sequence looks like this:
> 
> - leader gets update
> - leader indexes locally _and_ forwards to follower
> - follower is busy (and the delete-by-query could be why) and takes too long to respond, so the request times out
> - leader says “hmmm, I don’t know what happened so I’ll tell the follower to recover”.
> 
> Given your heavy update rate, there’ll be no chance for “peer sync” to fully recover, so it’ll go into full recovery. That can sometimes be fixed by simply lengthening the timeout.
> 
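If a timeout does turn out to be the trigger, the leader-to-replica update timeouts are configured in the <solrcloud> section of solr.xml; the values below are only illustrative (milliseconds):

    <solrcloud>
      <!-- socket and connection timeouts for updates the leader forwards to replicas -->
      <int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:600000}</int>
      <int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:60000}</int>
    </solrcloud>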
> Otherwise, also take a look at the logs and see if you can find a root cause for the replica going into recovery, and we should see if we can fix that.
> 
> I didn’t ask what versions of Solr you’re using, but in the 7x code line (7.3 IIRC) significant work was done to make recovery less likely.
> 
> Best,
> Erick
> 
>> On May 22, 2019, at 10:27 AM, Shawn Heisey <apache@elyograg.org> wrote:
>> 
>> On 5/22/2019 10:47 AM, Russell Taylor wrote:
>>> I will add that we have set commits to be only called by the loading program. We have turned off soft and autoCommits in the solrconfig.xml.
>> 
>> Don't turn off autoCommit. Regular hard commits, typically with openSearcher set to false so they don't interfere with change visibility, are extremely important for good Solr operation. Without them, the transaction logs will grow out of control. In addition to taking a lot of disk space, that will cause a Solr restart to happen VERY slowly. Note that a hard commit with openSearcher set to false will be VERY fast -- doing them frequently is usually not a problem for performance. Sample configs in recent Solr versions ship with autoCommit set to 15 seconds and openSearcher set to false.
>> 
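The shipped configuration Shawn refers to looks roughly like this in solrconfig.xml (a trimmed sketch: 15-second hard commits that never open a new searcher):

    <updateHandler class="solr.DirectUpdateHandler2">
      <autoCommit>
        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
        <openSearcher>false</openSearcher>
      </autoCommit>
    </updateHandler>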
>> Not using autoSoftCommit is a reasonable thing to do if you do not need that functionality ... but don't disable autoCommit.
>> 
>> Thanks,
>> Shawn
> 
