ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-6071) Client may detect necessity for reconnect for too long
Date Tue, 14 Nov 2017 12:27:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251312#comment-16251312
] 

ASF GitHub Bot commented on IGNITE-6071:
----------------------------------------

Github user alamar closed the pull request at:

    https://github.com/apache/ignite/pull/2906


> Client may detect necessity for reconnect for too long
> ------------------------------------------------------
>
>                 Key: IGNITE-6071
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6071
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.1
>            Reporter: Yakov Zhdanov
>            Assignee: Ilya Kasnacheev
>
> There was a GC pause on client that caused servers to drop client due to inability to
establish TCP communication connection. Then it took some time for client to detect that it
has been dropped. During that time client many times attempted to connect to server which
can be seen in the logs. After client detected its drop and reconnected servers fired node
added event and no log flood can be found any more.
> We need to find out why client was reconnecting via communication and did not detect
the drop for such a long time.
> I hope this can be reproduced in test:
> * start 2 servers
> * start client
> * suspend all client threads with Thread.suspend() - just filter threads of current JVM
by name and suspend ones belonging to the client.
> {noformat}
> [10:12:24,785][WARNING][disco-event-worker-#71%null%][GridDiscoveryManager] Node FAILED:
TcpDiscoveryNode [id=dd71479c-41ba-443e-b25c-3803a2a94f4f, addrs=[10.44.3.14, 127.0.0.1],
sockAddrs=[/127.0.0.1:0, XXX.com/10.44.3.14:0], discPort=0, order=2, intOrder=2, lastExchangeTime=1502269008673,
loc=false, ver=2.1.1#20170618-sha1:09ce29e0, isClient=true]
> [10:12:24,785][INFO][disco-event-worker-#71%null%][GridDiscoveryManager] Topology snapshot
[ver=5, servers=2, clients=1, CPUs=144, heap=76.0GB]
> [10:12:24,794][INFO][exchange-worker-#72%null%][time] Started exchange init [topVer=AffinityTopologyVersion
[topVer=5, minorTopVer=0], crd=false, evt=12, node=TcpDiscoveryNode [id=98c1fdf7-09db-4fa0-bb01-8ca7f046643d,
addrs=[10.44.3.11, 127.0.0.1], sockAddrs=[/127.0.0.1:47500, XXX.com/10.44.3.11:47500], discPort=47500,
order=3, intOrder=3, lastExchangeTime=1502269944782, loc=true, ver=2.1.1#20170618-sha1:09ce29e0,
isClient=false], evtNode=TcpDiscoveryNode [id=98c1fdf7-09db-4fa0-bb01-8ca7f046643d, addrs=[10.44.3.11,
127.0.0.1], sockAddrs=[/127.0.0.1:47500, XXX.com/10.44.3.11:47500], discPort=47500, order=3,
intOrder=3, lastExchangeTime=1502269944782, loc=true, ver=2.1.1#20170618-sha1:09ce29e0, isClient=false],
customEvt=null]
> [10:12:24,813][INFO][exchange-worker-#72%null%][time] Finished exchange init [topVer=AffinityTopologyVersion
[topVer=5, minorTopVer=0], crd=false]
> [10:12:24,819][INFO][exchange-worker-#72%null%][GridCachePartitionExchangeManager] Skipping
rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=0], evt=NODE_FAILED,
node=dd71479c-41ba-443e-b25c-3803a2a94f4f]
> [10:12:28,344][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52474]
> [10:12:28,348][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52482]
> [10:12:28,356][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52506]
> [10:12:28,362][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52522]
> [10:12:28,368][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52538]
> [10:12:28,374][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52554]
> [10:12:28,380][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52570]
> [10:12:28,386][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52586]
> [10:12:28,392][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52602]
> [10:12:28,397][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52618]
> [10:12:28,402][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52634]
> [10:12:28,407][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52650]
> [10:12:28,412][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:52666]
> ...
> [10:18:32,684][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:43604]
> [10:18:32,690][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:43620]
> [10:18:32,695][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi] Accepted
incoming communication connection [locAddr=/10.44.3.11:47100, rmtAddr=/10.44.3.14:43636]
> [10:18:42,831][INFO][disco-event-worker-#71%null%][GridDiscoveryManager] Added new node
to topology: TcpDiscoveryNode [id=2e80b0f0-21db-451d-a264-34ba16e00ffa, addrs=[10.44.3.14,
127.0.0.1], sockAddrs=[/127.0.0.1:0, gbrdsr000002837.intranet.barcapint.com/10.44.3.14:0],
discPort=0, order=6, intOrder=5, lastExchangeTime=1502270322805, loc=false, ver=2.1.1#20170618-sha1:09ce29e0,
isClient=true]
> [10:18:42,832][INFO][disco-event-worker-#71%null%][GridDiscoveryManager] Topology snapshot
[ver=6, servers=2, clients=2, CPUs=144, heap=90.0GB]
> [10:18:42,833][INFO][exchange-worker-#72%null%][time] Started exchange init [topVer=AffinityTopologyVersion
[topVer=6, minorTopVer=0], crd=false, evt=10, node=TcpDiscoveryNode [id=98c1fdf7-09db-4fa0-bb01-8ca7f046643d,
addrs=[10.44.3.11, 127.0.0.1], sockAddrs=[/127.0.0.1:47500, XXX.com/10.44.3.11:47500], discPort=47500,
order=3, intOrder=3, lastExchangeTime=1502270322815, loc=true, ver=2.1.1#20170618-sha1:09ce29e0,
isClient=false], evtNode=TcpDiscoveryNode [id=98c1fdf7-09db-4fa0-bb01-8ca7f046643d, addrs=[10.44.3.11,
127.0.0.1], sockAddrs=[/127.0.0.1:47500, XXX.com/10.44.3.11:47500], discPort=47500, order=3,
intOrder=3, lastExchangeTime=1502270322815, loc=true, ver=2.1.1#20170618-sha1:09ce29e0, isClient=false],
customEvt=null]
> [10:18:42,851][INFO][exchange-worker-#72%null%][time] Finished exchange init [topVer=AffinityTopologyVersion
[topVer=6, minorTopVer=0], crd=false]
> [10:18:42,855][INFO][exchange-worker-#72%null%][GridCachePartitionExchangeManager] Skipping
rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=6, minorTopVer=0], evt=NODE_JOINED,
node=2e80b0f0-21db-451d-a264-34ba16e00ffa]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message