From Sergii Tyshlek <stysh...@llnw.com>
Subject Rebalance is skipped
Date Tue, 16 Aug 2016 09:54:10 GMT
Hello there!

Some time ago we started moving from old GridGain to current Apache Ignite
(1.6, now 1.7).

Here are some cache config properties we use:




At first everything looked OK, but then I noticed that our data is not
distributed evenly between nodes: even though we use FairAffinityFunction,
the coordinator node hoards most of the cache entries. Later I discovered
(by wrapping FairAffinityFunction in a custom class) that the affinity
function works as expected, but rebalancing does not.

Shortly after starting 6 nodes (right after the last one joins the
topology), the following debug logs appear:

2016-08-15 07:36:43,070 DEBUG [exchange-worker-#318%EQGrid%]
[preloader.GridDhtPreloader] - <p_queryResults> Skipping partition
assignment (state is not MOVING): GridDhtLocalPartition [id=0,
currentMapImpl@5c35248d, rmvQueue=GridCircularBuffer [sizeMask=255,
idxGen=0], cntr=0, state=OWNING, reservations=0, empty=true,
createTime=08/15/2016 07:36:33]
// repeats 1024 times in total (id=0..1023), once for every partition
// then it is followed by
2016-08-15 07:36:43,177 DEBUG [exchange-worker-#318%EQGrid%]
[preloader.GridDhtPartitionDemander] - <p_queryResults> Adding partition
assignments: GridDhtPreloaderAssignments [topVer=AffinityTopologyVersion
[topVer=6, minorTopVer=0], cancelled=false, exchId=GridDhtPartitionExchangeId
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], nodeId=383991fb,
evt=NODE_JOINED], super={}]
2016-08-15 07:36:43,177 DEBUG [exchange-worker-#318%EQGrid%]
[preloader.GridDhtPartitionDemander] - <p_queryResults> Rebalancing is not
required [cache=p_queryResults, topology=AffinityTopologyVersion [topVer=6,
2016-08-15 07:36:43,178 DEBUG [exchange-worker-#318%EQGrid%]
[preloader.GridDhtPartitionDemander] - <p_queryResults> Completed rebalance
future: RebalanceFuture [sndStoppedEvnt=false,
topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], updateSeq=6]
2016-08-15 07:36:43,179 INFO [exchange-worker-#318%EQGrid%]
[cache.GridCachePartitionExchangeManager] - Skipping rebalancing (nothing
scheduled) [top=AffinityTopologyVersion [topVer=6, minorTopVer=0],
evt=NODE_JOINED, node=383991fb-5453-4893-9040-1baa1291881a]

So I started digging. Using GridDhtPartitionTopology, I got the partition
map, which (aggregated by state) looked like this:
Node: 38ae4165-474d-4ed4-a292-cca78b8df5c3, partitions: {MOVING=340}
Node: 8ac8d327-dc59-473f-a3e1-c5861f63f0e6, partitions: {MOVING=341}
Node: c7047158-9e7b-494f-bceb-3a5774853a6c, partitions: {MOVING=342}
Node: c9cc1a1f-f037-43c8-8855-0f1ccb8f0ec5, partitions: {MOVING=342}
Node: dce874ff-cc1e-41c8-9e82-abfb3dfa535e, partitions: {OWNING=1024}
Node: de783f6d-dc48-46b8-a387-91dd3d181150, partitions: {MOVING=342}
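
For clarity, the aggregation above is just a count of partition states per
node. Here is a rough sketch in plain Python (no Ignite involved). It
assumes the map has already been dumped into a dict of node ID to
{partition ID: state name}; the function name, the input shape and the
sample data are made up for illustration and are not an Ignite API.

```python
from collections import Counter


def aggregate_states(partition_map):
    """Count partitions per state for every node.

    partition_map: {node_id: {partition_id: state_name}}
    (hypothetical shape, standing in for whatever was read out of
    GridDhtPartitionTopology).
    """
    return {
        node: dict(Counter(states.values()))
        for node, states in partition_map.items()
    }


# Tiny made-up input, only to show the shape of the result.
example = {
    "node-a": {0: "OWNING", 1: "OWNING", 2: "OWNING"},
    "node-b": {0: "MOVING", 2: "MOVING"},
}

summary = aggregate_states(example)
for node, counts in sorted(summary.items()):
    print(f"Node: {node}, partitions: {counts}")
```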

The important point is that this distribution never changes, neither right
after grid start nor after a few hours. Ingesting (or not ingesting) data
doesn't seem to affect it either. Changing rebalanceDelay and commenting
out affinityMapper also made no difference.
From what I'm seeing, the affinity function distributes partitions evenly
(6 nodes, ~341 partitions each = 2048, i.e. 1024 partitions and a backup),
but the coordinator node never releases the extra 1024-341=683 partitions
and remains an owner of every partition in the grid.
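
To double-check the numbers, here is the same arithmetic as a small Python
sketch. The MOVING counts and the coordinator's OWNING count are copied
from the partition map above; the variable names are mine, and treating
"1024 partitions and a backup" as two copies per partition is the only
assumption.

```python
PARTITIONS = 1024  # partitions in the cache
COPIES = 2         # primary plus one backup (assumed from "and a backup")
NODES = 6

# MOVING counts of the five non-coordinator nodes, from the map above.
moving = [340, 341, 342, 342, 342]
coordinator_owning = 1024

total_copies = PARTITIONS * COPIES                     # 2048
fair_share = total_copies / NODES                      # about 341.33 per node
expected_on_coordinator = total_copies - sum(moving)   # 2048 - 1707 = 341
excess = coordinator_owning - expected_on_coordinator  # 1024 - 341 = 683

print(f"fair share per node:        {fair_share:.2f}")
print(f"expected on coordinator:    {expected_on_coordinator}")
print(f"partitions never released:  {excess}")
```

So the five MOVING counts plus a fair share of 341 on the coordinator add up
to exactly 2048, which is why I believe the affinity assignment itself is
correct and only the hand-over is missing.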

Please help me understand what might cause this behavior. I have included
the logs and properties that seemed relevant to the issue, and I can
provide more if needed.

- regards, Sergii

