lucene-dev mailing list archives

From Gus Heck <gus.h...@gmail.com>
Subject Autoscaling in 8.0
Date Fri, 18 Jan 2019 19:27:07 GMT
I'm a little worried about the state of Autoscaling. It looks like it has
the potential to create bad first experiences. Granted, 8.0 isn't supposed
to be stable yet, but I'm seeing things that were documented for 7.6 not
working in 8.x.

TLDR:

   1. Default settings didn't distribute cores evenly across the nodes of
   a brand-new 50-node cluster.
   2. I can't seem to write rules that produce suggestions to distribute
   them evenly.
   3. Suggestions are made that then fail, despite a quiet cluster with no
   changes.

Long version:

My client and I did something that seems very vanilla, but it didn't work
out well, and the observed behavior contradicts what's published in
https://lucene.apache.org/solr/guide/7_6/solr-upgrade-notes.html#solr-7-6
with respect to default core placement.

The cluster is a 50 node AWS cluster that was freshly set up by a client to
test out 8.0.0 (8.0.0-SNAPSHOT 69cbe29e78c400db22aab2f918405ce627d2d65d -
solr - 2019-01-11 15:41:35).

They created a collection (A) with 50 shards, one replica each (total of 50
cores). They specified maxShardsPerNode=1, and nothing relating to
autoscaling. They indexed a small amount of data (33438861 docs is small
for them) for initial testing. They then handed it over to me and, not yet
having noticed anything wrong with it, I added a second collection (B),
similarly configured but with schema changes for comparison. However, I noticed at
that point that the nodes page was showing a very strange result for this
seemingly vanilla set of steps. Most nodes got one core of each collection,
but not all:

Node 1 got 2 cores from A
Node 2 got 0 cores
Node 8 got 3 cores from B
Node 21 got 2 cores from A and 1 from B
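
A per-node tally like the one above does not have to be read off the nodes
page by hand. The following is a minimal sketch, assuming the parsed JSON of
a Collections API CLUSTERSTATUS response (cluster -> collections -> shards ->
replicas, each replica carrying a node_name, plus a live_nodes list). The
function names are hypothetical and not part of Solr.

```python
from collections import Counter


def cores_per_node(cluster_status):
    """Tally cores per node and per collection from a CLUSTERSTATUS response."""
    cluster = cluster_status["cluster"]
    # Seed with live_nodes so that nodes holding nothing still appear.
    tally = {node: Counter() for node in cluster.get("live_nodes", [])}
    for coll_name, coll in cluster.get("collections", {}).items():
        for shard in coll.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                node = replica["node_name"]
                tally.setdefault(node, Counter())[coll_name] += 1
    return tally


def uneven_nodes(tally, expected_per_collection=1):
    """Return nodes whose core count for any collection differs from the expected one."""
    collections = {name for counts in tally.values() for name in counts}
    return {
        node: dict(counts)
        for node, counts in sorted(tally.items())
        if any(counts[name] != expected_per_collection for name in collections)
    }
```

With one core per collection expected on every node, uneven_nodes() would
list nodes such as 1, 2, 8 and 21 above, with an empty mapping for a node
that holds no cores at all.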

I've spent all morning fiddling with rules to try to get a configuration
that provides suggestions via /api/cluster/autoscaling/suggestions to
equalize things and I just can't do it. In particular I can't ever get any
suggestion to move anything to node 2. It's as if autoscaling is
missing/unable to see node 2. A couple of times I got suggestions with
green buttons in the UI (though I'm mostly using Postman). When I clicked
the green button, it errored out saying no-node can satisfy....
Nothing is changing and no data is coming in, so why is it suggesting
things that don't work?
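
To check whether a given node is ever proposed as a destination, the
suggestions output can be scanned mechanically. This is only a sketch, and
the response shape is an assumption: a top-level "suggestions" list whose
entries carry operation -> command -> "move-replica" -> "targetNode". The
function names are hypothetical, and the input is the already-parsed JSON
from /api/cluster/autoscaling/suggestions.

```python
def suggested_target_nodes(suggestions_response):
    """Collect targetNode values from move-replica operations (assumed shape)."""
    targets = set()
    for suggestion in suggestions_response.get("suggestions", []):
        command = suggestion.get("operation", {}).get("command", {})
        move = command.get("move-replica")
        if move and "targetNode" in move:
            targets.add(move["targetNode"])
    return targets


def never_targeted(live_nodes, suggestions_response):
    """Return live nodes that no suggestion proposes as a move target."""
    return sorted(set(live_nodes) - suggested_target_nodes(suggestions_response))
```

If a node that holds no cores keeps showing up in never_targeted() across
different rule sets, that matches the symptom described above.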

When I look at /autoscaling/diagnostics I get this seemingly impossible
result:
            {
                "node": "solr-2.customer.redacted.com:8983_solr",
                "isLive": true,
                "cores": 2,
                "freedisk": 140.03918838500977,
                "totaldisk": 147.5209503173828,
                "replicas": {}
            },

2 cores but no replicas? I looked on disk, and there's no data there
representing a core.
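
The same mismatch can be looked for on every node at once. A minimal
sketch, assuming a list of node entries shaped like the one above, and
assuming that a non-empty "replicas" value nests as collection -> shard ->
list of replicas (that nesting is an assumption, as are the function
names):

```python
def count_replicas(replicas):
    """Count replica entries in an assumed collection -> shard -> [replicas] nesting."""
    if isinstance(replicas, list):
        return len(replicas)
    if isinstance(replicas, dict):
        return sum(count_replicas(value) for value in replicas.values())
    return 0


def inconsistent_nodes(node_entries):
    """Return (node, cores, replica_count) for entries where the two numbers disagree."""
    mismatches = []
    for entry in node_entries:
        cores = entry.get("cores", 0)
        replica_count = count_replicas(entry.get("replicas", {}))
        if cores != replica_count:
            mismatches.append((entry["node"], cores, replica_count))
    return mismatches
```

Fed the entry above, this reports the solr-2 node with 2 cores against 0
replicas.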

-Gus

-- 
http://www.the111shift.com
