ignite-issues mailing list archives

From "Dmitriy Pavlov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-8785) Node may hang indefinitely in CONNECTING state during cluster segmentation
Date Tue, 26 Jun 2018 16:09:29 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Pavlov updated IGNITE-8785:
-----------------------------------
    Fix Version/s:     (was: 2.6)
                   2.7

> Node may hang indefinitely in CONNECTING state during cluster segmentation
> --------------------------------------------------------------------------
>
>                 Key: IGNITE-8785
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8785
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 2.5
>            Reporter: Pavel Kovalenko
>            Priority: Major
>             Fix For: 2.7
>
>
> Affected test: org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest#testTopologyValidatorWithCacheGroup
> The node hangs with the following stack trace:
> {noformat}
> "grid-starter-testTopologyValidatorWithCacheGroup-22" #117619 prio=5 os_prio=0 tid=0x00007f17dd19b800 nid=0x304a in Object.wait() [0x00007f16b19df000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
> 	at java.lang.Object.wait(Native Method)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:931)
> 	- locked <0x0000000705ee4a60> (a java.lang.Object)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:373)
> 	at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1948)
> 	at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:915)
> 	at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1739)
> 	at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1046)
> 	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014)
> 	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723)
> 	- locked <0x0000000705995ec0> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
> 	at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151)
> 	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:649)
> 	at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:882)
> 	at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:845)
> 	at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:833)
> 	at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:799)
> 	at org.apache.ignite.testframework.junits.GridAbstractTest$3.call(GridAbstractTest.java:742)
> 	at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86)
> {noformat}
> It seems that the node never receives an acknowledgment from the coordinator.
> An earlier failure was logged before the hang:
> {noformat}
> [org.apache.ignite:ignite-core] [2018-06-10 04:59:18,876][WARN ][grid-starter-testTopologyValidatorWithCacheGroup-22][IgniteCacheTopologySplitAbstractTest$SplitTcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
> {noformat}
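
The stack trace above shows the starter thread in TIMED_WAITING inside Object.wait(), called from joinTopology. The following is a minimal sketch, not Ignite's actual code, of how a timed-wait loop of that general shape keeps repeating when the awaited acknowledgment never arrives, and how an overall deadline would bound it. The class JoinWaitSketch, its methods onAck and awaitAck, and the parameters pollMs and deadlineMs are all hypothetical names introduced for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; this is not taken from ServerImpl. */
public class JoinWaitSketch {
    private final Object mux = new Object();
    private boolean acked; // guarded by mux

    /** Called by a (hypothetical) receiver thread when the acknowledgment arrives. */
    public void onAck() {
        synchronized (mux) {
            acked = true;
            mux.notifyAll();
        }
    }

    /**
     * Waits for the acknowledgment in timed slices.
     *
     * @param pollMs length of one timed wait, must be positive
     * @param deadlineMs overall limit in milliseconds; 0 means no limit
     * @return true if acknowledged, false if the deadline elapsed first
     */
    public boolean awaitAck(long pollMs, long deadlineMs) throws InterruptedException {
        long start = System.nanoTime();
        synchronized (mux) {
            while (!acked) {
                long waitMs = pollMs;
                if (deadlineMs > 0) {
                    long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                    long remainingMs = deadlineMs - elapsedMs;
                    if (remainingMs <= 0)
                        return false; // give up instead of waiting forever
                    waitMs = Math.min(pollMs, remainingMs);
                }
                // A thread dump taken here reports TIMED_WAITING (on object monitor).
                mux.wait(waitMs);
            }
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Acknowledgment is lost: only the deadline ends the wait.
        JoinWaitSketch lost = new JoinWaitSketch();
        System.out.println("ack lost, bounded wait: " + lost.awaitAck(50, 200));

        // Acknowledgment is delivered by another thread.
        JoinWaitSketch ok = new JoinWaitSketch();
        Thread receiver = new Thread(ok::onAck);
        receiver.start();
        System.out.println("ack delivered: " + ok.awaitAck(50, 0));
        receiver.join();
    }
}
```

With deadlineMs set to 0 and no call to onAck, the loop never exits, which matches the symptom described in the report. Whether the real cause is a lost acknowledgment is the reporter's hypothesis, not something this sketch demonstrates.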



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)