ignite-issues mailing list archives

From "Denis Magda (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-2688) InterruptException for segmentation issues
Date Fri, 04 Mar 2016 05:53:40 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179379#comment-15179379
] 

Denis Magda commented on IGNITE-2688:
-------------------------------------

The issue described in this ticket is not the cause of the problem observed on your
side. The fix for this issue will simply check whether a node is stopping due to segmentation
and, if it is, will avoid printing the error below:

{noformat}
[18:16:31,629][SEVERE][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] TcpDiscoverSpi's message
worker thread failed abnormally. Stopping the node in order to prevent cluster wide instability.
java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)
	at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:519)
	at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:682)
	at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:5786)
	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2160)
	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
{noformat}

In your case, you should investigate the cause of the long GC pauses and either fix it or tune
the JVM by increasing the heap size or setting specific GC parameters [1].
In addition, you may want to increase the generic IgniteConfiguration.failureDetectionTimeout on
all the nodes, setting it to a value larger than the longest GC pause.
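
As a rough illustration of the sizing advice above, here is a minimal sketch that uses only the JDK. The class GcPauseProbe, its method names, the safety factor and all the numbers are hypothetical and are not part of Ignite. The longest pause is assumed to come from your own GC logs, since the JDK management beans only expose cumulative GC time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Ignite: reports cumulative GC time and
// derives a timeout that is larger than the longest observed GC pause.
final class GcPauseProbe {
    /** Total time in milliseconds spent in GC so far, summed over all collectors. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0)
                total += t;
        }
        return total;
    }

    /**
     * Returns the longest pause multiplied by a safety factor, but never less
     * than the timeout that is already configured.
     */
    static long suggestTimeoutMillis(long longestPauseMillis, long currentTimeoutMillis, double safetyFactor) {
        if (longestPauseMillis < 0 || currentTimeoutMillis <= 0 || safetyFactor < 1.0)
            throw new IllegalArgumentException("pause >= 0, timeout > 0 and factor >= 1 are required");
        long padded = (long) Math.ceil(longestPauseMillis * safetyFactor);
        return Math.max(currentTimeoutMillis, padded);
    }

    public static void main(String[] args) {
        // Example values only: a 12 s pause taken from GC logs, a 10 s current timeout.
        long suggested = suggestTimeoutMillis(12_000L, 10_000L, 2.0);
        System.out.println("GC time so far (ms): " + totalGcTimeMillis());
        System.out.println("Suggested failureDetectionTimeout (ms): " + suggested);
        // With Ignite on the classpath the value would be applied on every node,
        // for example via IgniteConfiguration.setFailureDetectionTimeout(suggested)
        // before the node is started.
    }
}
```

The safety factor is a judgment call. Raising the timeout only hides the symptom, so finding the cause of the pauses remains the first step.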

[1] https://apacheignite.readme.io/docs/performance-tips#tune-garbage-collection

> InterruptException for segmentation issues
> ------------------------------------------
>
>                 Key: IGNITE-2688
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2688
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Kozlov
>            Assignee: Denis Magda
>            Priority: Minor
>
> We're still seeing the following exception for segmentation issues:
> {noformat}
> [18:16:31,566][WARNING][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] Node is out of
topology (probably, due to short-time network problems).
> [18:16:31,566][WARNING][disco-event-worker-#46%null%][GridDiscoveryManager] Local node
SEGMENTED: TcpDiscoveryNode [id=19cf4b0f-d520-4915-be9f-813a99f945a5, addrs=[0:0:0:0:0:0:0:1,
127.0.0.1, 172.22.222.44, 192.168.1.117], sockAddrs=[work-pc/172.22.222.44:47501, /0:0:0:0:0:0:0:1:47501,
/172.22.222.44:47501, /127.0.0.1:47501, /172.22.222.44:47501, /192.168.1.117:47501], discPort=47501,
order=4, intOrder=4, lastExchangeTime=1455808591566, loc=true, ver=1.6.0#19700101-sha1:00000000,
isClient=false]
> [18:16:31,629][SEVERE][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] TcpDiscoverSpi's
message worker thread failed abnormally. Stopping the node in order to prevent cluster wide
instability.
> java.lang.InterruptedException
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)
> 	at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:519)
> 	at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:682)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:5786)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2160)
> 	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> [18:16:31,851][WARNING][sys-#22%null%][GridDhtAtomicCache] <cache_fad03851_2_08519933018899859>
Failed to send near update reply to node because it left grid: fad03851-2077-4b50-92b3-00ec6d85fa39
> [18:16:31,866][WARNING][disco-event-worker-#46%null%][GridDiscoveryManager] Stopping
local node according to configured segmentation policy.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
