ignite-user mailing list archives

From Andrey Mashenkov <andrey.mashen...@gmail.com>
Subject Re: FailureDetectionTimeOut not working
Date Thu, 20 Apr 2017 20:38:54 GMT
Hi,

Ignite uses a ring topology. Almost all network exchange is done via
TcpDiscoverySpi [1] (as you can see, you bind it to port 47500). Topology
updates and cluster heartbeats use it. The failure detection timeout is the
time window within which every node must send an update to the NEXT node in
the topology via Discovery.
Ignite also allows nodes to communicate with each other directly via
CommunicationSpi (port 47100 by default).
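
As a rough sketch only (not taken from your setup), the two SPIs and the
timeout could be set explicitly in Java as shown below. The class name
ExplicitPortsConfig is hypothetical, the ports are simply the defaults
mentioned above, and the code assumes ignite-core is on the classpath.

```java
import java.util.Collections;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

// Hypothetical example class; the name is not part of Ignite.
public class ExplicitPortsConfig {
    public static void main(String[] args) {
        // Discovery: ring topology traffic (topology updates, heartbeats).
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Collections.singletonList("127.0.0.1:47500"));

        TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
        discoverySpi.setLocalPort(47500);
        discoverySpi.setIpFinder(ipFinder);

        // Communication: direct node-to-node traffic (cache operations etc.).
        TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
        communicationSpi.setLocalPort(47100);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setFailureDetectionTimeout(20_000); // milliseconds
        cfg.setDiscoverySpi(discoverySpi);
        cfg.setCommunicationSpi(communicationSpi);

        // Start a node with this configuration and stop it on exit.
        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Local node: " + ignite.cluster().localNode().id());
        }
    }
}
```

Both ports have to be reachable between all nodes, not only the discovery
port.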

From the stack trace you can see that the connection to the communication
port failed.

Sometimes users forget to open the communication ports for nodes and keep
only the discovery ports open. This can cause grid operations to hang, as
nodes can neither exchange data nor leave the topology.
So, if a node cannot be reached via communication for some time, it should
be kicked out of the topology. That is what you see in the logs.
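
To check whether both ports are actually reachable from another host, a
plain TCP probe is usually enough. The following is a minimal sketch using
only the JDK; the class PortProbe and its method isReachable are
hypothetical helpers, not part of Ignite, and the ports are the defaults
mentioned above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Ignite.
final class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int[] ports = {47500, 47100}; // default discovery and communication ports
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable=" + isReachable(host, port, 2000));
        }
    }
}
```

If the discovery port answers but the communication port does not, that
matches the situation described above.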



[1]
https://ignite.apache.org/releases/mobile/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html

On Thu, Apr 20, 2017 at 6:31 AM, <smriti.aggarwal@barclays.com> wrote:

> Hi Team,
>
> I want to configure failureDetectionTimeout so that I can control how
> long it takes for clients to be disconnected when a server fails.
>
> For testing purposes, I brought up one server and one client in a
> cluster, with the property below set.
>
> Here's my config:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <beans xmlns="http://www.springframework.org/schema/beans"
>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>        xmlns:util="http://www.springframework.org/schema/util"
>        xsi:schemaLocation="
>         http://www.springframework.org/schema/beans
>         http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
>         http://www.springframework.org/schema/util
>         http://www.springframework.org/schema/util/spring-util-2.0.xsd">
>     <bean class="org.apache.ignite.configuration.IgniteConfiguration">
>         <!-- Set to true to enable grid-aware class loading for examples, default is false. -->
>         <property name="peerClassLoadingEnabled" value="true"/>
>         <property name="failureDetectionTimeout" value="20000"/>
>
>         <!-- Enable events for examples. -->
>         <property name="includeEventTypes">
>             <util:constant static-field="org.apache.ignite.events.EventType.EVTS_ALL"/>
>         </property>
>
>         <!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
>         <property name="discoverySpi">
>             <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
>                 <property name="ipFinder">
>                     <!-- Uncomment multicast IP finder to enable multicast-based discovery of initial nodes. -->
>                     <!--<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">-->
>                     <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
>                         <property name="addresses">
>                             <list>
>                                 <!-- In distributed environment, replace with actual host IP address. -->
>                                 <value>127.0.0.1:47500</value>
>                             </list>
>                         </property>
>                     </bean>
>                 </property>
>             </bean>
>         </property>
>
>         <property name="cacheConfiguration">
>             <bean class="org.apache.ignite.configuration.CacheConfiguration">
>                 <property name="name" value="test_NextcacheLocalStore"/>
>                 <property name="cacheMode" value="PARTITIONED"/>
>             </bean>
>         </property>
>     </bean>
> </beans>
>
> What's happening is that when I bring down my server, the client gets
> disconnected before the failureDetectionTimeout has passed.
>
> I brought down the server @ 8:42:50, and the client gets disconnected
> within 10 seconds. Here are the logs (from client):
>
> Apr 20, 2017 8:52:52 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Connect timed out (consider increasing 'failureDetectionTimeout' configuration property) [addr=/0:0:0:0:0:0:0:1:47100, failureDetectionTimeout=20000]
> Apr 20, 2017 8:52:53 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Connect timed out (consider increasing 'failureDetectionTimeout' configuration property) [addr=/127.0.0.1:47100, failureDetectionTimeout=20000]
> Apr 20, 2017 8:52:54 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Connect timed out (consider increasing 'failureDetectionTimeout' configuration property) [addr=NYKDWMVDI012486.INTRANET.BARCAPINT.com/10.136.138.135:47100, failureDetectionTimeout=20000]
> Apr 20, 2017 8:52:54 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Failed to connect to a remote node (make sure that destination node is alive and operating system firewall is disabled on local and remote hosts) [addrs=[/0:0:0:0:0:0:0:1:47100, /127.0.0.1:47100, NYKDWMVDI012486.INTRANET.BARCAPINT.com/10.136.138.135:47100]]
> Apr 20, 2017 8:52:58 AM org.apache.ignite.logger.java.JavaLogger error
> SEVERE: Failed to reconnect to cluster (consider increasing 'networkTimeout' configuration property) [networkTimeout=5000]
> Apr 20, 2017 8:53:03 AM org.apache.ignite.logger.java.JavaLogger info
> INFO:
>
> >>> +---------------------------------------------------------------------------------+
> >>> Ignite ver. 1.7.3#20161110-sha1:10582ae13b52d679a5827b409328a452ead2f1aa stopped OK
> >>> +---------------------------------------------------------------------------------+
> >>> Grid uptime: 00:00:21:509
>
> javax.cache.CacheException: class org.apache.ignite.IgniteClientDisconnectedException: Failed to ping node, client node disconnected.
>          at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1507)
>          at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.cacheException(IgniteCacheProxy.java:2138)
>          at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1338)
>          at org.gridgain.examples.Smriti.CacheLocalstore.CachePut.addEmpToCache(CachePut.java:68)
>          at org.gridgain.examples.Smriti.CacheLocalstore.CachePut.main(CachePut.java:34)
>          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>          at java.lang.reflect.Method.invoke(Method.java:601)
>          at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> Caused by: class org.apache.ignite.IgniteClientDisconnectedException: Failed to ping node, client node disconnected.
>          at org.apache.ignite.internal.util.IgniteUtils$15.apply(IgniteUtils.java:841)
>          at org.apache.ignite.internal.util.IgniteUtils$15.apply(IgniteUtils.java:839)
>          ... 10 more
> Caused by: class org.apache.ignite.internal.IgniteClientDisconnectedCheckedException: Failed to ping node, client node disconnected.
>          at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1423)
>          at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:846)
>          at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:990)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.mapSingle(GridNearAtomicAbstractUpdateFuture.java:269)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:504)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:434)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:209)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$23.apply(GridDhtAtomicCache.java:1150)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$23.apply(GridDhtAtomicCache.java:1148)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.asyncOp(GridDhtAtomicCache.java:846)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAsync0(GridDhtAtomicCache.java:1148)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.putAsync0(GridDhtAtomicCache.java:618)
>          at org.apache.ignite.internal.processors.cache.GridCacheAdapter.putAsync(GridCacheAdapter.java:2541)
>          at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put(GridDhtAtomicCache.java:595)
>          at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2215)
>          at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1331)
>          ... 7 more
>
> Smriti.



-- 
Best regards,
Andrey V. Mashenkov
