lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: state.json being downloaded every 10 seconds
Date Tue, 17 May 2016 04:44:11 GMT
bq: One thing that still feels a bit odd though is that the health
check query was referencing a collection that no longer existed in the
cluster. So it seems like it was downloading the state for ALL
non-hosted collections, not a requested one.

This is a bit odd; I don't know whether there's fallback logic like
"if there's no such collection, look at them all". If you're _sure_
this is what's happening, and especially if you can provide a test
case, this is worth a JIRA to at least ensure that it's intended
behavior.
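
For checking which nodes actually host replicas of a given collection,
here is a minimal sketch. It assumes the usual per-collection state.json
layout (collection -> "shards" -> shard -> "replicas" -> replica ->
"node_name"); the sample data, node names, and function names are
hypothetical, and fetching the file from ZooKeeper is left out.

```python
import json

# Hypothetical sample mirroring the assumed state.json layout:
# {collection: {"shards": {shard: {"replicas": {replica: {"node_name": ...}}}}}}
SAMPLE_STATE = """
{"collectionB": {"shards": {
  "shard1": {"replicas": {
    "core_node1": {"node_name": "10.0.0.4:8983_solr", "state": "active"},
    "core_node2": {"node_name": "10.0.0.5:8983_solr", "state": "active"}}},
  "shard2": {"replicas": {
    "core_node3": {"node_name": "10.0.0.5:8983_solr", "state": "active"}}}}}}
"""


def nodes_hosting(state_json_text, collection):
    """Return the set of node names holding a replica of `collection`."""
    state = json.loads(state_json_text)
    # A missing collection yields an empty set rather than an error.
    shards = state.get(collection, {}).get("shards", {})
    return {
        replica["node_name"]
        for shard in shards.values()
        for replica in shard.get("replicas", {}).values()
        if "node_name" in replica
    }


def hosts_collection(state_json_text, collection, node_name):
    """True if `node_name` holds at least one replica of `collection`."""
    return node_name in nodes_hosting(state_json_text, collection)


if __name__ == "__main__":
    print(sorted(nodes_hosting(SAMPLE_STATE, "collectionB")))
```

Comparing that set against the nodes seen re-downloading state.json
would show whether only non-hosting nodes are affected.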

Best,
Erick

On Mon, May 16, 2016 at 9:28 PM, Jeff Wartes <jwartes@whitepages.com> wrote:
>
> Ah, I tracked this down to an haproxy that was set up on a load server during development
> and still running. It was configured with a health check every 10 seconds, so that’s pretty
> clearly the cause. Thanks for the pointer.
>
> One thing that still feels a bit odd though is that the health check query was referencing
> a collection that no longer existed in the cluster. So it seems like it was downloading the
> state for ALL non-hosted collections, not a requested one.
>
> This touches a bit on a sore point with me. I dislike that those collection-not-here
> proxy requests aren’t logged on the server doing the proxy, because you end up with traffic
> visible at the http interface but not the solr level. Honestly, I dislike that transparent
> proxy approach in general, because it means I lose the ability to dedicate entire nodes to
> the fan-out and shard-aggregation process like I could pre-solrcloud.
>
>
>
>
> On 5/16/16, 8:56 PM, "Erick Erickson" <erickerickson@gmail.com> wrote:
>
>>With the per-collection state.json, if "something" goes to a node that doesn't
>>host a replica for the requested collection, it downloads the state for the
>>"other" collection then throws it away.
>>
>>In this case, "something" is apparently asking the nodes hosting collectionA to
>>do "something" with collections B and/or C. Some support for this would
>>be if further investigation shows that the nodes that _do_ re-download the
>>info did _not_ have replicas B and C.
>>
>>What the "something" is that sends requests I'm not quite sure, but
>>that's a place to start.
>>
>>Best,
>>Erick
>>
>>On Mon, May 16, 2016 at 11:08 AM, Jeff Wartes <jwartes@whitepages.com> wrote:
>>>
>>> I have a solr 5.4 cluster with three collections, A, B, C.
>>> Nodes either host replicas for collection A, or B and C. Collections B and C
>>> are not currently used - no inserts or queries. Collection A is getting significant query
>>> traffic, but no insert traffic, and queries are only directed to nodes hosting replicas for
>>> collection A. ZK timeout is set to 15 seconds.
>>>
>>> I’ve noticed via tcpdump that, every 10 seconds exactly, several of the nodes
>>> (but not all) hosting collection A re-download the state.json for collections B and C. This
>>> behavior survives JVM restart.
>>>
>>> This isn’t a huge deal, the extra traffic isn’t very meaningful, but it’s
>>> odd and smells like a bug somewhere. Anyone seen something like this?
>>>
>>>
