ignite-issues mailing list archives

From "Vladislav Pyatkov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-4491) Commutation loss between two nodes leads to hang whole cluster.
Date Mon, 26 Dec 2016 11:28:58 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladislav Pyatkov updated IGNITE-4491:
--------------------------------------
    Attachment: Segmentation.7z

> Commutation loss between two nodes leads to hang whole cluster.
> ---------------------------------------------------------------
>
>                 Key: IGNITE-4491
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4491
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 1.8
>            Reporter: Vladislav Pyatkov
>            Priority: Critical
>         Attachments: Segmentation.7z
>
>
> Reproduction steps:
> 1) Start nodes:
> {noformat}
> DC1                       DC2
> 1 (10.116.172.1)      8 (10.116.64.11)
> 2 (10.116.172.2)      7 (10.116.64.12)
> 3 (10.116.172.3)      6 (10.116.64.13)
> 4 (10.116.172.4)      5 (10.116.64.14)
> {noformat}
> Each node has a client that runs on the same host as the server (see the source in the attachment).
> 2) Drop the connections.
> Between nodes 1 and 8:
> {noformat}
> 1 (10.116.172.1)      8 (10.116.64.11)
> {noformat}
> Drop all input and output traffic.
> Run on 10.116.172.1:
> {noformat}
> iptables -A INPUT -s 10.116.64.11 -j DROP
> iptables -A OUTPUT -d 10.116.64.11 -j DROP
> {noformat}
> Between nodes 4 and 5:
> {noformat}
> 4 (10.116.172.4)      5 (10.116.64.14)
> {noformat}
> Run on 10.116.172.4:
> {noformat}
> iptables -A INPUT -s 10.116.64.14 -j DROP
> iptables -A OUTPUT -d 10.116.64.14 -j DROP
> {noformat}
> 3) Stop the grid after several seconds.
> If you look at the logs, you can find which nodes were segmented after the traffic was dropped (note which clients were not segmented):
> {noformat}
> [12:04:33,914][INFO][disco-event-worker-#211%null%][GridDiscoveryManager] Topology snapshot [ver=18, servers=6, clients=8, CPUs=456, heap=68.0GB]
> {noformat}
> And all operations stopped at the same time.
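
For step 2 of the quoted reproduction, a small helper along these lines might make the partition easier to apply and to undo between runs. It is only a sketch and not part of the attached sources: the function name `partition_rules` is hypothetical, and the `del` branch (which uses `iptables -D` to delete the same rules) is an assumption about how one would restore connectivity, since the report itself only shows the `-A` commands. The function prints the commands rather than executing them, so the output can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not from the attachment): prints the iptables commands
# that cut this host off from one peer, or that remove those rules again.
# Usage: partition_rules add|del PEER_IP
partition_rules() {
    action=$1
    peer=$2
    case "$action" in
        add) flag=-A ;;   # append the DROP rules, as in step 2 of the report
        del) flag=-D ;;   # assumption: delete the same rules to restore traffic
        *)
            echo "usage: partition_rules add|del PEER_IP" >&2
            return 1
            ;;
    esac
    # Drop traffic arriving from the peer and traffic sent to the peer.
    echo "iptables $flag INPUT -s $peer -j DROP"
    echo "iptables $flag OUTPUT -d $peer -j DROP"
}

# Example (requires root), run on 10.116.172.1 to isolate it from node 8:
#   partition_rules add 10.116.64.11 | sh
# and later, to restore the link:
#   partition_rules del 10.116.64.11 | sh
```

Printing the commands instead of running them keeps the helper safe to try on any machine; piping the output to `sh` as root applies it.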



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
