ignite-issues mailing list archives

From "Denis Magda (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-1977) IgniteSemaphore's failover related tests lead to the deadlock
Date Mon, 30 Nov 2015 13:30:11 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031767#comment-15031767 ]

Denis Magda commented on IGNITE-1977:
-------------------------------------

Finished improving and reworking IgniteSemaphore failover tests.

The current situation is as follows:
- both testSemaphoreFailoverSafe and testSemaphoreNonFailoverSafe always pass;
- the rest of the tests fail intermittently, so I've deliberately "muted" them by inserting fail("https://issues.apache.org/jira/browse/IGNITE-1977") in the doTestSemaphore() method implementation.

Vladislav, would you mind taking a look at the failing tests and fixing the implementation of IgniteSemaphore?

You need to check them against all the suites that extend GridCacheAbstractDataStructuresFailoverSelfTest.

> IgniteSemaphore's failover related tests lead to the deadlock
> -------------------------------------------------------------
>
>                 Key: IGNITE-1977
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1977
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Denis Magda
>            Assignee: Denis Magda
>         Attachments: ignite-1977.patch
>
>
> All {{IgniteSemaphore}} related tests from {{GridCacheAbstractDataStructuresFailoverSelfTest}} may cause a deadlock which leads to the whole suite hanging.
> The threads are waiting for the following condition:
> {noformat}
> "topology-change-thread-3" prio=6 tid=0x000000001d98d800 nid=0x2b20 waiting on condition [0x000000002066f000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x0000000798149948> (a org.apache.ignite.internal.processors.datastructures.GridCacheSemaphoreImpl$Sync)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
> 	at org.apache.ignite.internal.processors.datastructures.GridCacheSemaphoreImpl.acquire(GridCacheSemaphoreImpl.java:538)
> 	at org.apache.ignite.internal.processors.datastructures.GridCacheSemaphoreImpl.acquire(GridCacheSemaphoreImpl.java:525)
> 	at org.apache.ignite.internal.processors.cache.datastructures.GridCacheAbstractDataStructuresFailoverSelfTest$7.apply(GridCacheAbstractDataStructuresFailoverSelfTest.java:571)
> 	at org.apache.ignite.internal.util.lang.GridAbsClosure.run(GridAbsClosure.java:50)
> 	at org.apache.ignite.testframework.GridTestUtils$7.call(GridTestUtils.java:967)
> 	at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86)
> {noformat}
> Probably the semaphore is not properly released when a node leaves the topology abruptly.
> In addition, the tests should be rewritten to follow the approach used by the other data structures and atomics in this suite: using {{ConstantTopologyChangeWorker}} and its descendants.
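
The suspected failure mode above (a permit that is never returned when its holder disappears) can be illustrated with a minimal sketch. It uses the plain java.util.concurrent.Semaphore rather than IgniteSemaphore, and a thrown exception stands in for a node leaving the topology abruptly; the class name LostPermitSketch and the method waiterGetsPermit are hypothetical and do not come from the Ignite code base. It is not a reproduction of the actual bug, only of the general pattern that would leave a waiter parked as in the stack trace.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class LostPermitSketch {
    /**
     * Simulates a permit holder that fails after acquiring, then checks whether
     * a second acquirer can still obtain the permit within a short timeout.
     *
     * @param releaseOnFailure whether the failing holder gives its permit back.
     */
    static boolean waiterGetsPermit(boolean releaseOnFailure) throws InterruptedException {
        Semaphore sem = new Semaphore(1);

        Thread holder = new Thread(() -> {
            try {
                sem.acquire();

                // Stand-in (assumption) for a node leaving the topology abruptly.
                throw new IllegalStateException("simulated node failure");
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            catch (IllegalStateException e) {
                // A failover-aware implementation would return the permit here.
                if (releaseOnFailure)
                    sem.release();
            }
        });

        holder.start();
        holder.join();

        // A bounded wait instead of acquire(), so the sketch cannot hang.
        return sem.tryAcquire(200, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("permit returned on failure: " + waiterGetsPermit(true));
        System.out.println("permit lost on failure: " + waiterGetsPermit(false));
    }
}
```

With an unbounded acquire() in place of tryAcquire(), the second case would park indefinitely, which is how a single lost permit can hang a whole suite. Using a timed acquire in tests is one possible way to turn such a hang into an ordinary test failure.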



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
