geode-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bruce Schuchardt <bschucha...@pivotal.io>
Subject Re: Unit tests are hanging?
Date Wed, 07 Aug 2019 18:21:40 GMT
Yeah, that test passed on my branch in unit tests and stress tests but 
for some reason is hanging after the merge to develop. I've pushed an 
@Ignore for the test that you should pick up.

On 8/7/19 10:38 AM, Kirk Lund wrote:
> Yep, that's the same test I'm seeing in the callstacks of my dunit tgz from
> concourse...
>
> Started @ 2019-08-07 07:25:18.494 +0000
> 2019-08-07 07:51:25.252 +0000
>   org.apache.geode.cache30.DistributedMulticastRegionDUnitTest
> testMulticastAfterReconnect
> Ended @ 2019-08-07 08:28:16.591 +0000
>
> Thanks for digging into it!
>
> On Wed, Aug 7, 2019 at 10:16 AM Ryan McMahon <rmcmahon@pivotal.io> wrote:
>
>> I think the reflection and PowerMock warnings here are probably a red
>> herring.  We pulled down the artifacts and found that the DUnit job is
>> hanging due to stuck threads in a newer DUnit test.  I am not sure why it
>> isn't failing the test but rather failing the entire job.  I believe Bruce
>> Schuchart is going to dig into this a bit.  Analysis from Lynn
>> Hughes-Godfrey:
>>
>> So, I downloaded
>>
>>
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/961
>>
>> to
>>
>>
>> /Users/lhughesgodfrey/Downloads/distributedtestfiles-OpenJDK8-1.11.0-SNAPSHOT.0010/geode-core/build/distributedTest
>>
>> In that callstacks directory, you'll see dunit-hangs.txt ... and this
>> test has been running for over an hour ...
>>
>> ```
>>
>> Started @ 2019-08-07 07:54:10.272 +0000
>>
>> 2019-08-07 08:22:25.440 +0000
>> org.apache.geode.cache30.DistributedMulticastRegionDUnitTest
>> testMulticastAfterReconnect
>>
>> Ended @ 2019-08-07 08:58:50.243 +0000
>>
>> ```
>>
>> When I look at the callstacks (at the top level), I see this:
>>
>> ```
>>
>> "Pooled Waiting Message Processor 1" #213 daemon prio=5 os_prio=0
>> tid=0x00007fc738024000 nid=0x488 waiting for monitor entry
>> [0x00007fc7ec5f7000]
>>
>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>
>>          at
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.startEventProcessing(GMSMembershipManager.java:1179)
>>
>>          - waiting to lock <0x00000000e03a75e0> (a java.lang.Object)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.readyForMessages(ClusterDistributionManager.java:1152)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.lambda$startThreads$10(ClusterDistributionManager.java:1129)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager$$Lambda$87/1408912936.run(Unknown
>> Source)
>>
>>          at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>
>>          at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:961)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.doWaitingThread(ClusterDistributionManager.java:851)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager$$Lambda$53/274751999.invoke(Unknown
>> Source)
>>
>>          at
>> org.apache.geode.internal.logging.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:121)
>>
>>          at
>> org.apache.geode.internal.logging.LoggingThreadFactory$$Lambda$49/1824093464.run(Unknown
>> Source)
>>
>>          at java.lang.Thread.run(Thread.java:748)
>>
>>     Locked ownable synchronizers:
>>
>>          - <0x00000000e03a7660> (a
>> java.util.concurrent.ThreadPoolExecutor$Worker)
>>
>> ```
>>
>> This thread holds the lock:
>>
>> ```
>>
>> "ReconnectThread" #193 prio=5 os_prio=0 tid=0x00007fc7cc219000
>> nid=0x46a waiting on condition [0x00007fc78bdfb000]
>>
>>     java.lang.Thread.State: TIMED_WAITING (sleeping)
>>
>>          at java.lang.Thread.sleep(Native Method)
>>
>>          at
>> org.apache.geode.internal.tcp.Connection.createSender(Connection.java:959)
>>
>>          at
>> org.apache.geode.internal.tcp.ConnectionTable.handleNewPendingConnection(ConnectionTable.java:297)
>>
>>          at
>> org.apache.geode.internal.tcp.ConnectionTable.getSharedConnection(ConnectionTable.java:408)
>>
>>          at
>> org.apache.geode.internal.tcp.ConnectionTable.get(ConnectionTable.java:593)
>>
>>          at
>> org.apache.geode.internal.tcp.TCPConduit.getConnection(TCPConduit.java:825)
>>
>>          at
>> org.apache.geode.distributed.internal.direct.DirectChannel.getConnections(DirectChannel.java:537)
>>
>>          at
>> org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:326)
>>
>>          at
>> org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:241)
>>
>>          at
>> org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:596)
>>
>>          at
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1558)
>>
>>          at
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.send(GMSMembershipManager.java:1739)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2849)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:2776)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2813)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1523)
>>
>>          at
>> org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:174)
>>
>>          at
>> org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.sendReply(DistributedCacheOperation.java:1269)
>>
>>          at
>> org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1132)
>>
>>          at
>> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:370)
>>
>>          at
>> org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:425)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.scheduleIncomingMessage(ClusterDistributionManager.java:2862)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.handleIncomingDMsg(ClusterDistributionManager.java:2549)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.access$1700(ClusterDistributionManager.java:113)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.messageReceived(ClusterDistributionManager.java:3388)
>>
>>          at
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.dispatchMessage(GMSMembershipManager.java:1043)
>>
>>          at
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.processStartupEvent(GMSMembershipManager.java:1154)
>>
>>          at
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.startEventProcessing(GMSMembershipManager.java:1205)
>>
>>          - locked <0x00000000e03a75e0> (a java.lang.Object)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:911)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:540)
>>
>>          at
>> org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:756)
>>
>>          at
>> org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:135)
>>
>>          at
>> org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:2994)
>>
>>          at
>> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:251)
>>
>>          - locked <0x00000000e0397568> (a java.lang.Object)
>>
>>          at
>> org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:158)
>>
>>          at
>> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2575)
>>
>>          at
>> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2392)
>>
>>          - locked <0x00000000e045f178> (a java.lang.Object)
>>
>>          - locked <0x00000000e047a880> (a java.lang.Class for
>> org.apache.geode.internal.cache.GemFireCacheImpl)
>>
>>          - locked <0x00000000e0512eb8> (a java.lang.Class for
>> org.apache.geode.internal.cache.InternalCacheBuilder)
>>
>>          at
>> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1207)
>>
>>          at
>> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:3383)
>>
>>          at
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager.uncleanShutdown(GMSMembershipManager.java:1441)
>>
>>          at
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager$ManagerImpl.lambda$forceDisconnect$0(GMSMembershipManager.java:2613)
>>
>>          at
>> org.apache.geode.distributed.internal.membership.adapter.GMSMembershipManager$ManagerImpl$$Lambda$198/1187156898.run(Unknown
>> Source)
>>
>>          at java.lang.Thread.run(Thread.java:748)
>>
>>     Locked ownable synchronizers:
>>
>>          - None
>>
>> ```
>>
>> Another blocked thread:
>>
>> ```
>>
>> "RMI TCP Connection(1)-172.17.0.13" #32 daemon prio=5 os_prio=0
>> tid=0x00007fc7d4001800 nid=0x27e waiting for monitor entry
>> [0x00007fc849054000]
>>
>>     java.lang.Thread.State: BLOCKED (on object monitor)
>>
>>          at java.lang.Object.wait(Native Method)
>>
>>          - waiting on <0x00000000e045f178> (a java.lang.Object)
>>
>>          at
>> org.apache.geode.distributed.internal.InternalDistributedSystem.waitUntilReconnected(InternalDistributedSystem.java:2895)
>>
>>          - locked <0x00000000e045f178> (a java.lang.Object)
>>
>>          at
>> org.apache.geode.internal.cache.GemFireCacheImpl.waitUntilReconnected(GemFireCacheImpl.java:2408)
>>
>>          at
>> org.apache.geode.cache30.DistributedMulticastRegionDUnitTest.lambda$testMulticastAfterReconnect$bb17a952$1(DistributedMulticastRegionDUnitTest.java:168)
>>
>>          at
>> org.apache.geode.cache30.DistributedMulticastRegionDUnitTest$$Lambda$200/626559185.run(Unknown
>> Source)
>>
>>          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>
>>          at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>
>>          at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>
>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>
>>          at
>> org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
>>
>>          at
>> org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:69)
>>
>>          at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>>
>>          at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>
>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>
>>          at
>> sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
>>
>>          at sun.rmi.transport.Transport$1.run(Transport.java:200)
>>
>>          at sun.rmi.transport.Transport$1.run(Transport.java:197)
>>
>>          at java.security.AccessController.doPrivileged(Native Method)
>>
>>          at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
>>
>>          at
>> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
>>
>>          at
>> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
>>
>>          at
>> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
>>
>>          at
>> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$8/1801094145.run(Unknown
>> Source)
>>
>>          at java.security.AccessController.doPrivileged(Native Method)
>>
>>          at
>> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
>>
>>          at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>
>>          at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>
>>          at java.lang.Thread.run(Thread.java:748)
>>
>>     Locked ownable synchronizers:
>>
>>          - <0x00000000e01b9da0> (a
>> java.util.concurrent.ThreadPoolExecutor$Worker)
>>
>> ```
>>
>>
>> Thanks,
>>
>> Ryan
>>
>>
>> On Wed, Aug 7, 2019 at 10:05 AM Kirk Lund <klund@apache.org> wrote:
>>
>>> We previously decided to stop using PowerMock in any Geode tests and
>> remove
>>> it. Ryan and I created https://issues.apache.org/jira/browse/GEODE-6143
>>> but
>>> it's languishing because it takes a lot of time to strip out PowerMock
>> from
>>> a test.
>>>
>>> I don't get any PowerMock warnings locally when I run unit tests.
>>>
>>> On Wed, Aug 7, 2019 at 9:50 AM Murtuza Boxwala <mboxwala@pivotal.io>
>>> wrote:
>>>
>>>> Yea.  I think it might be a red herring, because I am seeing those
>> errors
>>>> in every run, passing ones two…just double checked on
>>>>
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UnitTestOpenJDK11/builds/948
>>>> <
>>>>
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UnitTestOpenJDK11/builds/948
>>>> If we don’t have any other ideas, though, it might be worth removing
>>> them.
>>>> It is odd that you don’t see them locally. Let me see if I can get them
>>> to
>>>> show up…it might have something to do with the logging level.
>>>>
>>>>> On Aug 7, 2019, at 12:45 PM, Kirk Lund <klund@apache.org> wrote:
>>>>>
>>>>> I don't know if the PowerMock warnings are related but that's the
>> only
>>>>> thing interesting in the output before "timeout exceeded".
>>>>>
>>>>> On Wed, Aug 7, 2019 at 9:43 AM Murtuza Boxwala <mboxwala@pivotal.io>
>>>> wrote:
>>>>>> How down know these warnings are related to the builds hanging.
>> When
>>> I
>>>>>> was on CIO duty a couple days ago, I remember seeing this warning
>> in a
>>>>>> failing build, but then I looked back at passing builds on saw this
>> as
>>>> well.
>>>>>>> On Aug 7, 2019, at 12:40 PM, Kirk Lund <klund@apache.org>
wrote:
>>>>>>>
>>>>>>> The build is broken in CI right now (for main CI and PRs). The
>>> UnitTest
>>>>>>> jobs are timing out so I assume there's a hang of some sort.
>>>>>>>
>>>>>>> The WARNINGs appear to be related to PowerMock and begins with
"An
>>>>>> illegal
>>>>>>> reflective access operation" in geode-assembly:test.
>>>>>>>
>>>>>>> I'm running unit tests locally and haven't hit this yet, but
maybe
>> we
>>>>>>> should consider disabling all PowerMock tests and increase priority
>>> on
>>>>>>> re-enabling them without PowerMock?
>>>>>>>
>>>>>>>> Task :geode-assembly:test
>>>>>>> WARNING: An illegal reflective access operation has occurred
>>>>>>> WARNING: Illegal reflective access by
>>>>>>> org.mockito.internal.util.reflection.AccessibilityChanger
>>>>>>>
>> (file:/home/geode/.gradle/caches/modules-2/files-2.1/org.mockito/mockito-core/2.23.0/497ddb32fd5d01f9dbe99a2ec790aeb931dff1b1/mockito-core-2.23.0.jar)
>>>>>>> to field java.io.File.path
>>>>>>> WARNING: Please consider reporting this to the maintainers of
>>>>>>> org.mockito.internal.util.reflection.AccessibilityChanger
>>>>>>> WARNING: Use --illegal-access=warn to enable warnings of further
>>>> illegal
>>>>>>> reflective access operations
>>>>>>> WARNING: All illegal access operations will be denied in a future
>>>> release
>>>>>>> WARNING: An illegal reflective access operation has occurred
>>>>>>> WARNING: Illegal reflective access by
>>>>>>> org.mockito.internal.util.reflection.AccessibilityChanger
>>>>>>>
>> (file:/home/geode/.gradle/caches/modules-2/files-2.1/org.mockito/mockito-core/2.23.0/497ddb32fd5d01f9dbe99a2ec790aeb931dff1b1/mockito-core-2.23.0.jar)
>>>>>>> to field java.io.File.path
>>>>>>> WARNING: Please consider reporting this to the maintainers of
>>>>>>> org.mockito.internal.util.reflection.AccessibilityChanger
>>>>>>> WARNING: Use --illegal-access=warn to enable warnings of further
>>>> illegal
>>>>>>> reflective access operations
>>>>>>> WARNING: All illegal access operations will be denied in a future
>>>> release
>>>>>>>> Task :geode-core:test
>>>>>>> WARNING: An illegal reflective access operation has occurred
>>>>>>> WARNING: Illegal reflective access by
>>>>>>> org.apache.geode.distributed.internal.deadlock.UnsafeThreadLocal
>>>>>>> (file:/home/geode/geode/geode-core/build/classes/java/main/)
to
>>> method
>>>>>>> java.lang.ThreadLocal.getMap(java.lang.Thread)
>>>>>>> WARNING: Please consider reporting this to the maintainers of
>>>>>>> org.apache.geode.distributed.internal.deadlock.UnsafeThreadLocal
>>>>>>> WARNING: Use --illegal-access=warn to enable warnings of further
>>>> illegal
>>>>>>> reflective access operations
>>>>>>> WARNING: All illegal access operations will be denied in a future
>>>> release
>>>>>>> WARNING: An illegal reflective access operation has occurred
>>>>>>> WARNING: Illegal reflective access by
>>>>>>> org.mockito.internal.util.reflection.AccessibilityChanger
>>>>>>>
>> (file:/home/geode/.gradle/caches/modules-2/files-2.1/org.mockito/mockito-core/2.23.0/497ddb32fd5d01f9dbe99a2ec790aeb931dff1b1/mockito-core-2.23.0.jar)
>>>>>>> to field java.util.Date.fastTime
>>>>>>> WARNING: Please consider reporting this to the maintainers of
>>>>>>> org.mockito.internal.util.reflection.AccessibilityChanger
>>>>>>> WARNING: Use --illegal-access=warn to enable warnings of further
>>>> illegal
>>>>>>> reflective access operations
>>>>>>> WARNING: All illegal access operations will be denied in a future
>>>> release
>>>>>>> WARNING: An illegal reflective access operation has occurred
>>>>>>> WARNING: Illegal reflective access by
>>>>>>> org.powermock.reflect.internal.WhiteboxImpl
>>>>>>>
>> (file:/home/geode/.gradle/caches/modules-2/files-2.1/org.powermock/powermock-reflect/2.0.0-beta.5/4ea415348f15620783a1f26343d6732adfa86bc8/powermock-reflect-2.0.0-beta.5.jar)
>>>>>>> to method java.lang.Object.finalize()
>>>>>>> WARNING: Please consider reporting this to the maintainers of
>>>>>>> org.powermock.reflect.internal.WhiteboxImpl
>>>>>>> WARNING: Use --illegal-access=warn to enable warnings of further
>>>> illegal
>>>>>>> reflective access operations
>>>>>>> WARNING: All illegal access operations will be denied in a future
>>>> release
>>>>>>> WARNING: An illegal reflective access operation has occurred
>>>>>>> WARNING: Illegal reflective access by
>>>>>>> org.junit.contrib.java.lang.system.EnvironmentVariables
>>>>>>>
>> (file:/home/geode/.gradle/caches/modules-2/files-2.1/com.github.stefanbirkner/system-rules/1.19.0/d541c9a1cff0dda32e2436c74562e2e4aa6c88cd/system-rules-1.19.0.jar)
>>>>>>> to field java.util.Collections$UnmodifiableMap.m
>>>>>>> WARNING: Please consider reporting this to the maintainers of
>>>>>>> org.junit.contrib.java.lang.system.EnvironmentVariables
>>>>>>> WARNING: Use --illegal-access=warn to enable warnings of further
>>>> illegal
>>>>>>> reflective access operations
>>>>>>> WARNING: All illegal access operations will be denied in a future
>>>> release
>>>>>>> timeout exceeded
>>>>>>
>>>>

Mime
View raw message