ignite-issues mailing list archives

From "Mikhail Cherkasov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-8985) Node segmented itself after connRecoveryTimeout
Date Wed, 11 Jul 2018 16:23:00 GMT
Mikhail Cherkasov created IGNITE-8985:

             Summary: Node segmented itself after connRecoveryTimeout
                 Key: IGNITE-8985
                 URL: https://issues.apache.org/jira/browse/IGNITE-8985
             Project: Ignite
          Issue Type: Bug
            Reporter: Mikhail Cherkasov
         Attachments: Archive.zip

I can see the following messages in the logs:

[2018-07-10 16:27:13,111][WARN ][tcp-disco-msg-worker-#2] Unable to connect to next nodes
in a ring, it seems local node is experiencing connectivity issues. Segmenting local node
to avoid case when one node fails a big part of cluster. To disable that behavior set TcpDiscoverySpi.setConnectionRecoveryTimeout()
to 0. [connRecoveryTimeout=10000, effectiveConnRecoveryTimeout=10000]
[2018-07-10 16:27:13,112][WARN ][disco-event-worker-#61] Local node SEGMENTED: TcpDiscoveryNode
[id=e1a19d8e-2253-458c-9757-e3372de3bef9, addrs=[,,], sockAddrs=[/,
lab17.gridgain.local/, /], discPort=47500, order=2, intOrder=2,
lastExchangeTime=1531229233103, loc=true, ver=2.4.7#20180710-sha1:a48ae923, isClient=false]

I have the failure detection timeout set to 60_000 ms, and during the test GC pauses were
under 25 seconds, so I don't expect the node to be segmented.
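
To make the mismatch concrete, here is a small self-contained sketch of the numbers above. It does not use the Ignite API. The class and helper names are hypothetical, and the rule that the effective connection recovery timeout is the smaller of connRecoveryTimeout and the failure detection timeout is an assumption that only happens to agree with the logged values (connRecoveryTimeout=10000, effectiveConnRecoveryTimeout=10000).

```java
// Hypothetical sketch; not Ignite code. It only compares the values from this report.
class TimeoutSketch {
    // ASSUMPTION: the effective value is the smaller of the two timeouts.
    // This matches the log line (10000 with a 60000 failure detection timeout),
    // but the real computation inside TcpDiscoverySpi may differ.
    static long effectiveConnRecoveryTimeout(long connRecoveryTimeoutMs,
                                             long failureDetectionTimeoutMs) {
        return Math.min(connRecoveryTimeoutMs, failureDetectionTimeoutMs);
    }

    // True when a pause of the given length is longer than the timeout.
    static boolean pauseExceeds(long pauseMs, long timeoutMs) {
        return pauseMs > timeoutMs;
    }

    public static void main(String[] args) {
        long connRecoveryTimeoutMs = 10_000L;     // from the log message
        long failureDetectionTimeoutMs = 60_000L; // from the report
        long gcPauseUpperBoundMs = 25_000L;       // report says GC pauses were < 25 s

        long effectiveMs = effectiveConnRecoveryTimeout(
            connRecoveryTimeoutMs, failureDetectionTimeoutMs);

        System.out.println("effective connection recovery timeout, ms: " + effectiveMs);
        System.out.println("pause near upper bound exceeds recovery timeout: "
            + pauseExceeds(gcPauseUpperBoundMs, effectiveMs));
        System.out.println("pause near upper bound exceeds failure detection timeout: "
            + pauseExceeds(gcPauseUpperBoundMs, failureDetectionTimeoutMs));
    }
}
```

Under these assumptions, a pause close to the 25-second upper bound is longer than the 10-second connection recovery timeout but well within the 60-second failure detection timeout, which would be consistent with the node segmenting itself earlier than the failure detection timeout suggests.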


Logs are attached.


