ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Mashenkov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-11460) MVCC: Possible race on coordinator changing on client reconnection.
Date Mon, 11 Mar 2019 15:17:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789680#comment-16789680
] 

Andrew Mashenkov commented on IGNITE-11460:
-------------------------------------------

[~NSAmelchev],

Looks like discovery guarantees that no disco events will be occurs between client node
got ClientNodeDisconnected and Local join (after which ClientReconnected will be generated).

It is not clear how this race can be reproduced as all disco events should be serialized and
coordinator can't be changed with late exchange processing in onExchangeDone.

> MVCC: Possible race on coordinator changing on client reconnection.
> -------------------------------------------------------------------
>
>                 Key: IGNITE-11460
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11460
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Amelchev Nikita
>            Assignee: Amelchev Nikita
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I found that the wrong coordinator can be set in case of client reconnect:
> {noformat}
> assert newCrd.topologyVersion().compareTo(curCrd.topologyVersion()) > 0;
> java.lang.AssertionError
>     at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onCoordinatorChanged(MvccProcessorImpl.java:541)
>     at org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onLocalJoin(MvccProcessorImpl.java:416)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:851)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:601)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2681)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2719)
>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> I have attached reproducer in PR.
> The main reason is that coordinator can be changed from discovery event thread when the
client already disconnect (disconnection processed in notifier thread and change coordinator
on onDisconnected method).
> Coordinator can be changed in cases:
> 1. notifier disco thread: onDisconnected method
> 2. event disco thread: onDiscovery listener.
> and events can be processed with some delay and override coordinator that set in notifier
thread. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message