ignite-user mailing list archives

From ignite_user2016 <rishiyag...@gmail.com>
Subject Frequent disconnection in Ignite cluster
Date Thu, 06 Jul 2017 18:19:54 GMT
Hello Igniters,

We are seeing frequent disconnections between Ignite instances. We run an
IP-based cluster (static IP finder) with the following configuration:

Ignite version - 1.7.0

<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                <property name="addresses">
                    <list>
                        <value>HOST_IP1:47500..47509</value>
                        <value>HOST_IP2:47500..47509</value>
                    </list>
                </property>
            </bean>
        </property>
    </bean>
</property>
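
To make the address entries concrete: as far as I understand, each `host:from..to` value tells the static IP finder to try every port in that inclusive range on the host. The sketch below is not Ignite code; `DiscoveryAddressRange` and `expand` are hypothetical names used only to illustrate how I read the notation, and it does not handle IPv6 literals.

```java
import java.util.ArrayList;
import java.util.List;

public class DiscoveryAddressRange {
    /**
     * Expands "host:from..to" into one "host:port" string per port.
     * Also accepts "host:port" and a bare "host" (returned unchanged).
     * Assumption: the range is inclusive on both ends.
     */
    static List<String> expand(String entry) {
        List<String> out = new ArrayList<>();
        int colon = entry.lastIndexOf(':');
        if (colon < 0) {
            out.add(entry); // no port given
            return out;
        }
        String host = entry.substring(0, colon);
        String ports = entry.substring(colon + 1);
        int dots = ports.indexOf("..");
        int from = Integer.parseInt(dots < 0 ? ports : ports.substring(0, dots));
        int to = dots < 0 ? from : Integer.parseInt(ports.substring(dots + 2));
        for (int p = from; p <= to; p++) {
            out.add(host + ":" + p);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] entries = {"HOST_IP1:47500..47509", "HOST_IP2:47500..47509"};
        for (String entry : entries) {
            List<String> addrs = expand(entry);
            System.out.println(entry + " -> " + addrs.size() + " candidate addresses ("
                + addrs.get(0) + " .. " + addrs.get(addrs.size() - 1) + ")");
        }
    }
}
```

Read this way, the two entries give 20 candidate addresses in total, which matches the log below, where nodes on the second host appear on discovery ports 47500 and 47501.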

See the error log - 

[09:53:46,139][WARN ][tcp-disco-msg-worker-#2%WebGrid%][TcpDiscoverySpi]
Local node has detected failed nodes and started cluster-wide procedure. To
speed up failure detection please see 'Failure Detection' section under
javadoc for 'TcpDiscoverySpi'

[09:54:56,060][WARN
][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Failed to
wait for partition map exchange [topVer=AffinityTopologyVersion
[topVer=22132, minorTopVer=0], node=d3719fe1-84cf-4fe5-91dd-2d10abb1b3d2].
Dumping pending objects that might be the cause:
[09:54:56,060][WARN
][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Ready
affinity version: AffinityTopologyVersion [topVer=22131, minorTopVer=0]
[09:54:56,062][WARN
][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Last
exchange future: GridDhtPartitionsExchangeFuture [dummy=false,
forcePreload=false, reassign=false, discoEvt=DiscoveryEvent
[evtNode=TcpDiscoveryNode [id=d2ffb86c-5305-4cb3-96a0-874be73d610a,
addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, host_ip2],
sockAddrs=[host2/host_ip2:47501, 0:0:0:0:0:0:0:1%lo:47501,
/127.0.0.1:47501], discPort=47501, order=22131, intOrder=11068,
lastExchangeTime=1499352867440, loc=false, ver=1.7.0#20160801-sha1:383273e3,
isClient=false], topVer=22132, nodeId8=d3719fe1, msg=Node left:
TcpDiscoveryNode [id=d2ffb86c-5305-4cb3-96a0-874be73d610a,
addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, host_ip2],
sockAddrs=[host2/host_ip2:47501, 0:0:0:0:0:0:0:1%lo:47501,
/127.0.0.1:47501], discPort=47501, order=22131, intOrder=11068,
lastExchangeTime=1499352867440, loc=false, ver=1.7.0#20160801-sha1:383273e3,
isClient=false], type=NODE_LEFT, tstamp=1499352886042], crd=TcpDiscoveryNode
[id=64ce302c-9743-47bc-bf27-641015a37b81, addrs=[127.0.0.1, host_ip1],
sockAddrs=[/127.0.0.1:47500, host1/host_ip1:47500], discPort=47500, order=1,
intOrder=1, lastExchangeTime=1498849915139, loc=false,
ver=1.7.0#20160801-sha1:383273e3, isClient=false],
exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion
[topVer=22132, minorTopVer=0], nodeId=d2ffb86c, evt=NODE_LEFT], added=true,
initFut=GridFutureAdapter [resFlag=2, res=true, startTime=1499352886042,
endTime=1499352886052, ignoreInterrupts=false, state=DONE],init=true,
topSnapshot=null, lastVer=null, partReleaseFut=GridCompoundFuture [rdc=null,
initFlag=1, lsnrCalls=3, done=true, cancelled=false, err=null, futs=[true,
true, true]], affChangeMsg=null, skipPreload=false,
clientOnlyExchange=false, initTs=1499352886042, centralizedAff=true,
evtLatch=0, remaining=[64ce302c-9743-47bc-bf27-641015a37b81],
srvNodes=[TcpDiscoveryNode [id=64ce302c-9743-47bc-bf27-641015a37b81,
addrs=[127.0.0.1, host_ip1],
sockAddrs=[/127.0.0.1:47500, host1/host_ip1:47500], discPort=47500, order=1,
intOrder=1, lastExchangeTime=1498849915139, loc=false,
ver=1.7.0#20160801-sha1:383273e3, isClient=false], TcpDiscoveryNode
[id=d3719fe1-84cf-4fe5-91dd-2d10abb1b3d2, addrs=[127.0.0.1, host_ip2],
sockAddrs=[/127.0.0.1:47500, host2/host_ip2:47500], discPort=47500, order=4,
intOrder=3, lastExchangeTime=1499352895809, loc=true,
ver=1.7.0#20160801-sha1:383273e3, isClient=false]], super=GridFutureAdapter
[resFlag=0, res=null, startTime=1499352886042, endTime=0,
ignoreInterrupts=false, state=INIT]]

[10:08:37,232][WARN
][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Failed to
wait for partition map exchange [topVer=AffinityTopologyVersion
[topVer=22134, minorTopVer=0], node=
d3719fe1-84cf-4fe5-91dd-2d10abb1b3d2]. Dumping pending objects that might be
the cause:
[10:08:47,287][WARN
][exchange-worker-#54%WebGrid%][GridCachePartitionExchangeManager] Failed to
wait for partition map exchange [topVer=AffinityTopologyVersion
[topVer=22134, minorTopVer=0], node=
d3719fe1-84cf-4fe5-91dd-2d10abb1b3d2]. Dumping pending objects that might be
the cause:

class org.apache.ignite.IgniteException: Failed to wait for affinity ready
future for topology version: AffinityTopologyVersion [topVer=22134,
minorTopVer=0]
        at
org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.awaitTopologyVersion(GridAffinityAssignmentCache.java:526)
        at
org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:434)
        at
org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.assignments(GridAffinityAssignmentCache.java:331)
        at
org.apache.ignite.internal.processors.cache.GridCacheAffinityManager.assignments(GridCacheAffinityManager.java:165)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.initPartitions0(GridDhtPartitionTopologyImpl.java:373)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.initPartitions(GridDhtPartitionTopologyImpl.java:340)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:1057)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:86)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:324)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processMessage(GridDhtPartitionsExchangeFuture.java:1400)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$400(GridDhtPartitionsExchangeFuture.java:86)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$4.apply(GridDhtPartitionsExchangeFuture.java:1369)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$4.apply(GridDhtPartitionsExchangeFuture.java:1357)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:263)
        at
org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:226)
        at
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onReceive(GridDhtPartitionsExchangeFuture.java:1357)
        at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processFullPartitionUpdate(GridCachePartitionExchangeManager.java:1030)
        at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1200(GridCachePartitionExchangeManager.java:112)
        at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$3.onMessage(GridCachePartitionExchangeManager.java:316)
        at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$3.onMessage(GridCachePartitionExchangeManager.java:314)
        at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:1807)
        at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:1789)
        at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:748)
        at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:353)
        at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:277)
        at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:88)
        at
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:231)
        at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1238)
        at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:866)
        at
org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:106)
        at
org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:829)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to wait
for topology update, cache (or node) is stopping.
        at
org.apache.ignite.internal.processors.cache.GridCacheAffinityManager.cancelFutures(GridCacheAffinityManager.java:92)
        at
org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStop(GridCacheProcessor.java:904)
        at
org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:1914)
        at
org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:1860)
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2266)
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2229)
        at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:323)
        at org.apache.ignite.Ignition.stop(Ignition.java:224)
        at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$8.run(GridDiscoveryManager.java:1946)
        ... 1 more
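
The `tstamp`, `startTime` and `lastExchangeTime` fields in the dump above look like epoch milliseconds. A small sketch for converting them to UTC, so they can be lined up with the bracketed wall-clock times in the log; `LogTimestamps` and `toUtc` are hypothetical helper names, and the millisecond interpretation is my assumption:

```java
import java.time.Instant;

public class LogTimestamps {
    /** Formats an epoch-millisecond value as an ISO-8601 UTC instant. */
    static String toUtc(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // Values copied from the exchange future dump above.
        long nodeLeftTstamp = 1499352886042L;      // tstamp of the NODE_LEFT event
        long leftNodeLastExchange = 1499352867440L; // lastExchangeTime of the node that left
        long localLastExchange = 1499352895809L;    // lastExchangeTime of the local node

        System.out.println("NODE_LEFT tstamp:            " + toUtc(nodeLeftTstamp));
        System.out.println("left node lastExchangeTime:  " + toUtc(leftNodeLastExchange));
        System.out.println("local node lastExchangeTime: " + toUtc(localLastExchange));
    }
}
```

The NODE_LEFT `tstamp` converts to 14:54:46 UTC on 2017-07-06. If the server clocks are five hours behind UTC (which I have not verified), that is 09:54:46 local time, ten seconds before the first "Failed to wait for partition map exchange" warning at 09:54:56.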



Can anyone advise what configuration tuning is required?

I have also noticed that CPU usage and JVM memory on the Ignite servers rise
gradually over a period of days.

Thanks for all your help..

Rishi



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/frequet-disconnection-in-ignite-cluster-tp14411.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
