ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sergey Chugunov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-10238) Intermittent Client Nodes suite hang
Date Tue, 18 Dec 2018 07:14:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723767#comment-16723767
] 

Sergey Chugunov commented on IGNITE-10238:
------------------------------------------

I investigated the code a bit more and found out the history of this change: in order to
offload plain old *disco-msg-worker* thread IGNITE-9398 introduced new thread *disco-notifier-worker.*

But it was't marked by _IgniteDiscoveryThread_ interface and later fix IGNITE-9895 didn't
solve the issue.

We need to put this marker interface to _IgniteThread_ that runs _DiscoveryMessageNotifierWorker_ and
not to worker itself.

> Intermittent Client Nodes suite hang
> ------------------------------------
>
>                 Key: IGNITE-10238
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10238
>             Project: Ignite
>          Issue Type: Test
>            Reporter: Alexey Goncharuk
>            Assignee: Amelchev Nikita
>            Priority: Critical
>              Labels: MakeTeamcityGreenAgain
>             Fix For: 2.8
>
>
> There are occasional hangs of Client Nodes suite in master. A quick peek at the thread
dumps reveals an interesting deadlock (only relevant parts of the thread dump are left):
> {code}
> "disco-notifier-worker-#634%internal.IgniteClientReconnectApiExceptionTest0%" #791 prio=5
os_prio=0 tid=0x00007f990c12d800 nid=0x11b9 waiting on condition [0x00007f991a0eb000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.metadata(CacheObjectBinaryProcessorImpl.java:656)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.metadata(CacheObjectBinaryProcessorImpl.java:206)
> 	at org.apache.ignite.internal.binary.BinaryContext.metadata(BinaryContext.java:1293)
> 	at org.apache.ignite.internal.binary.BinaryReaderExImpl.getOrCreateSchema(BinaryReaderExImpl.java:2007)
> 	at org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:286)
> 	at org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:185)
> 	at org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(BinaryReaderExImpl.java:1984)
> 	at org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(BinaryFieldAccessor.java:703)
> 	at org.apache.ignite.internal.binary.BinaryFieldAccessor.read(BinaryFieldAccessor.java:188)
> 	at org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:874)
> 	at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1764)
> 	at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716)
> 	at org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(BinaryReaderExImpl.java:1984)
> 	at org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(BinaryFieldAccessor.java:703)
> 	at org.apache.ignite.internal.binary.BinaryFieldAccessor.read(BinaryFieldAccessor.java:188)
> 	at org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:874)
> 	at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1764)
> 	at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716)
> 	at org.apache.ignite.internal.binary.GridBinaryMarshaller.deserialize(GridBinaryMarshaller.java:313)
> 	at org.apache.ignite.internal.binary.BinaryMarshaller.unmarshal0(BinaryMarshaller.java:101)
> 	at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:81)
> 	at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10131)
> 	at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10160)
> 	at org.apache.ignite.internal.GridEventConsumeHandler.p2pUnmarshal(GridEventConsumeHandler.java:390)
> 	at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1362)
> 	at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:111)
> 	at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:203)
> 	at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:194)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:725)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:602)
> 	- locked <0x00000007b62859b8> (a java.lang.Object)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4$$Lambda$17/432384581.run(Unknown
Source)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2665)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2703)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> 	at java.lang.Thread.run(Thread.java:748)
> "async-callable-runner-1" #876 prio=5 os_prio=0 tid=0x00007f990c26a000 nid=0x120b waiting
on condition [0x00007f991a2ed000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:495)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.addMeta(CacheObjectBinaryProcessorImpl.java:194)
> 	at org.apache.ignite.internal.binary.BinaryContext.updateMetadata(BinaryContext.java:1332)
> 	at org.apache.ignite.internal.binary.BinaryClassDescriptor.write(BinaryClassDescriptor.java:777)
> 	at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:223)
> 	at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:164)
> 	at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:151)
> 	at org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:254)
> 	at org.apache.ignite.internal.binary.BinaryMarshaller.marshal0(BinaryMarshaller.java:84)
> 	at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:57)
> 	at org.apache.ignite.internal.util.IgniteUtils.marshal(IgniteUtils.java:10213)
> 	at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1387)
> 	at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:666)
> 	at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:538)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> 	at org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:809)
> 	at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:476)
> 	at org.apache.ignite.internal.processors.closure.GridClosureProcessor.callAsync(GridClosureProcessor.java:449)
> 	at org.apache.ignite.internal.processors.closure.GridClosureProcessor.callAsync(GridClosureProcessor.java:420)
> 	at org.apache.ignite.internal.IgniteComputeImpl.broadcastAsync0(IgniteComputeImpl.java:635)
> 	at org.apache.ignite.internal.IgniteComputeImpl.broadcast(IgniteComputeImpl.java:611)
> 	at org.apache.ignite.internal.IgniteClientReconnectApiExceptionTest$26.apply(IgniteClientReconnectApiExceptionTest.java:578)
> 	at org.apache.ignite.internal.IgniteClientReconnectApiExceptionTest$26.apply(IgniteClientReconnectApiExceptionTest.java:574)
> 	at org.apache.ignite.internal.IgniteClientReconnectApiExceptionTest$36.call(IgniteClientReconnectApiExceptionTest.java:853)
> 	at org.apache.ignite.internal.IgniteClientReconnectApiExceptionTest$36.call(IgniteClientReconnectApiExceptionTest.java:851)
> 	at org.apache.ignite.testframework.GridTestUtils.lambda$runAsync$2(GridTestUtils.java:1003)
> 	at org.apache.ignite.testframework.GridTestUtils$$Lambda$133/962953099.run(Unknown Source)
> 	at org.apache.ignite.testframework.GridTestUtils$7.call(GridTestUtils.java:1299)
> 	at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:84)
> "tcp-disco-msg-worker-#88%internal.IgniteClientReconnectApiExceptionTest0%" #792 prio=10
os_prio=0 tid=0x00007f990c130800 nid=0x11ba waiting on condition [0x00007f997fdfc000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
> 	at org.apache.ignite.internal.util.future.IgniteFutureImpl.get(IgniteFutureImpl.java:134)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.notifyDiscoveryListener(ServerImpl.java:5648)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processCustomMessage(ServerImpl.java:5455)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2836)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2610)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7186)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2699)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> 	at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7117)
> 	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:61)
> {code}
> Need to investigate what it is that we are trying to deserialize in the discovery thread.
From the binary metadata workflow we should be able to deserialize the value right away.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message