ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-8752) Deadlock when registering binary metadata while holding topology read lock
Date Thu, 21 Jun 2018 12:05:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-8752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519273#comment-16519273
] 

ASF GitHub Bot commented on IGNITE-8752:
----------------------------------------

Github user ilantukh closed the pull request at:

    https://github.com/apache/ignite/pull/4215


> Deadlock when registering binary metadata while holding topology read lock
> --------------------------------------------------------------------------
>
>                 Key: IGNITE-8752
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8752
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.1
>            Reporter: Alexey Goncharuk
>            Assignee: Ilya Lantukh
>            Priority: Critical
>             Fix For: 2.6
>
>         Attachments: DeadlockTest.java
>
>
> The following deadlock was reproduced on ignite-2.4 version:
> {code}
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
> 	at org.apache.ignite.internal.MarshallerContextImpl.registerClassName(MarshallerContextImpl.java:284)
> 	at org.apache.ignite.internal.binary.BinaryContext.registerUserClassName(BinaryContext.java:1191)
> 	at org.apache.ignite.internal.binary.BinaryContext.registerUserClassDescriptor(BinaryContext.java:773)
> 	at org.apache.ignite.internal.binary.BinaryContext.registerClassDescriptor(BinaryContext.java:751)
> 	at org.apache.ignite.internal.binary.BinaryContext.descriptorForClass(BinaryContext.java:622)
> 	at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:164)
> 	at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:147)
> 	at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:134)
> 	at org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:251)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.marshalToBinary(CacheObjectBinaryProcessorImpl.java:396)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.marshalToBinary(CacheObjectBinaryProcessorImpl.java:381)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.toBinary(CacheObjectBinaryProcessorImpl.java:875)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.toCacheObject(CacheObjectBinaryProcessorImpl.java:825)
> 	at org.apache.ignite.internal.processors.cache.GridCacheContext.toCacheObject(GridCacheContext.java:1783)
> 	at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.runEntryProcessor(GridCacheMapEntry.java:5264)
> 	at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:4667)
> 	at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:4484)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3083)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$6200(BPlusTree.java:2977)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1732)
> 	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1610)
> 	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1270)
> 	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:370)
> 	at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:1769)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2420)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:1883)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1736)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1628)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:299)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:483)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:443)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1117)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.invoke0(GridDhtAtomicCache.java:826)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.invoke(GridDhtAtomicCache.java:784)
> 	at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.invoke(IgniteCacheProxyImpl.java:1359)
> 	at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.invoke(GatewayProtectedCacheProxy.java:1172)
> {code}
> {code}
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x000000008d108960> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> 	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
> 	at org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.lock0(StripedCompositeReadWriteLock.java:154)
> 	at org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.lock(StripedCompositeReadWriteLock.java:123)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.update(GridDhtPartitionTopologyImpl.java:1314)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processFullPartitionUpdate(GridCachePartitionExchangeManager.java:1463)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1100(GridCachePartitionExchangeManager.java:134)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$3.onMessage(GridCachePartitionExchangeManager.java:340)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$3.onMessage(GridCachePartitionExchangeManager.java:338)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:2696)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:2675)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1054)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
> 	at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
> 	at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
> {code}
> {code}
> "tcp-disco-msg-worker-#2%grid-name%" #127 prio=10 os_prio=0 tid=0x00007f9539745000 nid=0x6149
waiting on condition [0x00007f941aff3000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x000000008d188980> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
> 	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.localUpdateCounters(GridDhtPartitionTopologyImpl.java:2503)
> 	at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartAckRequest(GridContinuousProcessor.java:996)
> 	at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$500(GridContinuousProcessor.java:103)
> 	at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$3.onCustomEvent(GridContinuousProcessor.java:194)
> 	at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$3.onCustomEvent(GridContinuousProcessor.java:187)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:691)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery(GridDiscoveryManager.java:577)
> 	- locked <0x000000008d188d08> (a java.lang.Object)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.notifyDiscoveryListener(ServerImpl.java:5453)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processCustomMessage(ServerImpl.java:5340)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2739)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2531)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6730)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2614)
> 	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>    Locked ownable synchronizers:
> 	- None
> {code}
> A system thread holds the topology read lock and waits for binary meta registration,
while another user thread is registering a continuous query. As a result, discovery worker
is also attempting to acquire read lock, but this read lock is blocked because exchange worker
is already enqueued for write lock.
> One of the solutions would be an ability to read partition update counters without acquiring
topology read lock, however I am not sure if this is possible.
> Another solution is never synchronously wait for metadata update inside topology read
lock. If we need to register a new type, we should acquire the class to register, release
the lock, register the class and then retry to apply an entry processor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message