lucene-solr-user mailing list archives

From "Zimmermann, Thomas" <>
Subject Re: CloudSolrClient produces tons of CLUSTERSTATUS commands against single server in Cloud
Date Tue, 06 Nov 2018 18:08:05 GMT
Hi Shawn,

We're equally impressed by how well the server is handling it. We're using
Sematext for monitoring, and the load average on the box has held steady
below 1, with no swapping.

We are 100% certain the traffic is coming from the 3 web hosts running
this code. We have put some custom logging in place that writes all requests
to an access-style log and stores that data in Kibana/Logstash. In
Logstash we are able to confirm that all these requests (~40 million in the
last 12 hours) are coming from our web front ends directly to a single box
in the cluster.
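
For scale, here is a quick back-of-the-envelope conversion of that figure
into a per-second rate. This is only a sketch: the 40 million count is
approximate, and the variable names are made up for illustration.

```python
# Approximate request count and time window from the Logstash figures above.
total_requests = 40_000_000
window_hours = 12

# Average rate over the whole window, assuming traffic is spread evenly.
requests_per_second = total_requests / (window_hours * 3600)
print(f"{requests_per_second:.0f} requests/second")  # prints "926 requests/second"
```

That is roughly in line with the "over a thousand clusterstatus requests per
second" estimate quoted below.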

Our client code is on separate servers from our Solr servers, and ZK has
its own boxes as well.

Here's a scrubbed pastebin of our cluster status response from the machine
that is getting all the traffic. I pulled this via browser on my local

We can attempt to update the SolrJ dependency on our lower env and see if
that fixes the problem if you think that is a good course of action, but we
are also in the midst of switching over to HTTP Client to resolve the
production issues we are seeing ASAP, so I can't promise a timeline. If
you think there's a chance that will fix this, we could of course give it
a quick go.


On 11/6/18, 12:35 PM, "Shawn Heisey" <> wrote:

>On 11/6/2018 10:12 AM, Zimmermann, Thomas wrote:
>> Shawn -
>> Server performance is fine and request times are great. We are tolerating
>> the level of traffic, but the server that is taking all the hits is
>> obviously performing a bit slower than the others. Response times are
>> under 5 ms avg for queries on all servers, which is within our perf
>> thresholds.
>I was asking specifically about the clusterstatus requests -- whether
>the response looks complete if you manually execute the same request and
>whether it returns quickly.  And I'd like to see the solr.log where
>these are happening.
>Knowing that requests in general are performing well is good info,
>although I have no idea how that is possible on the node that is getting
>over a thousand clusterstatus requests per second.  I would expect that
>node to be essentially dead under that much load.  Since it's apparently
>handling it fine ... that's really impressive.
>> We are running 7.4 on the client and server side, moving to 7.5 was
>> troublesome for us so we are holding off for the time being.
>I was hoping you could just upgrade the SolrJ client, which would
>involve either replacing the solrj jar or bumping the version number in
>the config for a dependency manager (things like ivy, maven, gradle,
>etc).  A 7.5 client should be pretty safe against 7.4 servers.  The
>client would be newer than the server and very close to the same
>version, which is the general recommendation for CloudSolrClient when
>the two versions cannot be identical for some reason.
>Are you absolutely sure that those requests are coming from the program
>with CloudSolrClient?  To find out, you'll need to enable the request
>log in jetty.xml (it just needs to be un-commented) and restart the
>server.  The source address is not logged in solr.log.  It's very
>important to be absolutely sure where the requests are coming from.  If
>you're running the client code on the same machine as one of your Solr
>servers, it will be difficult to be sure about the source, so I would
>definitely suggest running the client code on a completely different
>machine, so the source addresses in the request log are useful.
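
Once the Jetty request log is enabled, the source-address check described
above could be sketched roughly as follows. This assumes the log uses the
common NCSA access-log layout (client address first, request line in double
quotes) and that the CLUSTERSTATUS parameters appear in the URL; parameters
sent in a POST body would not show up in the request line. The function
name, the sample lines, and the addresses are made up for illustration and
are not taken from the actual cluster.

```python
import re
from collections import Counter

# Start of an NCSA-style access log line: client address, two ignored fields,
# a bracketed timestamp, then the quoted request line.
LINE_RE = re.compile(r'^(?P<addr>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)"')


def count_clusterstatus_by_source(lines):
    """Count CLUSTERSTATUS requests per client address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        # Compare case-insensitively, since parameter casing may vary.
        if match and "ACTION=CLUSTERSTATUS" in match.group("request").upper():
            counts[match.group("addr")] += 1
    return counts


# Made-up sample lines standing in for a real request log.
sample_lines = [
    '10.0.0.11 - - [06/Nov/2018:17:12:01 +0000] "GET /solr/admin/collections?action=CLUSTERSTATUS&wt=javabin HTTP/1.1" 200 5120',
    '10.0.0.12 - - [06/Nov/2018:17:12:01 +0000] "GET /solr/admin/collections?action=CLUSTERSTATUS&wt=javabin HTTP/1.1" 200 5120',
    '10.0.0.11 - - [06/Nov/2018:17:12:02 +0000] "GET /solr/products/select?q=*:* HTTP/1.1" 200 830',
    '10.0.0.11 - - [06/Nov/2018:17:12:02 +0000] "GET /solr/admin/collections?action=CLUSTERSTATUS&wt=javabin HTTP/1.1" 200 5120',
]

counts = count_clusterstatus_by_source(sample_lines)
for addr, n in counts.most_common():
    print(addr, n)
```

For a real log, the same function can be given an open file object instead
of the sample list, since it only iterates over lines.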
