flink-dev mailing list archives

From Paris Carbone <par...@kth.se>
Subject Re: Build get stuck at BarrierBufferMassiveRandomTest
Date Wed, 23 Sep 2015 14:28:04 GMT
It hangs for me too at the same test when doing "clean verify"

> On 23 Sep 2015, at 16:09, Stephan Ewen <sewen@apache.org> wrote:
> 
> Okay, will look into this a bit today...
> 
> On Wed, Sep 23, 2015 at 4:04 PM, Ufuk Celebi <uce@apache.org> wrote:
> 
>> Same here.
>> 
>>> On 23 Sep 2015, at 13:50, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> It's the latest master I'm trying to build, but it still hangs.
>>> Here's the trace:
>>> 
>>> -----------------------------
>>> 2015-09-23 13:48:41
>>> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode):
>>> 
>>> "Attach Listener" daemon prio=5 tid=0x00007faeb984a000 nid=0x3707 waiting
>>> on condition [0x0000000000000000]
>>>  java.lang.Thread.State: RUNNABLE
>>> 
>>> "Service Thread" daemon prio=5 tid=0x00007faeb9808000 nid=0x4d03 runnable
>>> [0x0000000000000000]
>>>  java.lang.Thread.State: RUNNABLE
>>> 
>>> "C2 CompilerThread1" daemon prio=5 tid=0x00007faebb00e800 nid=0x4b03
>>> waiting on condition [0x0000000000000000]
>>>  java.lang.Thread.State: RUNNABLE
>>> 
>>> "C2 CompilerThread0" daemon prio=5 tid=0x00007faebb840800 nid=0x4903
>>> waiting on condition [0x0000000000000000]
>>>  java.lang.Thread.State: RUNNABLE
>>> 
>>> "Signal Dispatcher" daemon prio=5 tid=0x00007faeba806800 nid=0x3d0f
>>> runnable [0x0000000000000000]
>>>  java.lang.Thread.State: RUNNABLE
>>> 
>>> "Finalizer" daemon prio=5 tid=0x00007faebb836800 nid=0x3303 in
>>> Object.wait() [0x000000014eff8000]
>>>  java.lang.Thread.State: WAITING (on object monitor)
>>> at java.lang.Object.wait(Native Method)
>>> - waiting on <0x0000000138a84858> (a java.lang.ref.ReferenceQueue$Lock)
>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>>> - locked <0x0000000138a84858> (a java.lang.ref.ReferenceQueue$Lock)
>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>>> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
>>> 
>>> "Reference Handler" daemon prio=5 tid=0x00007faebb004000 nid=0x3103 in
>>> Object.wait() [0x000000014eef5000]
>>>  java.lang.Thread.State: WAITING (on object monitor)
>>> at java.lang.Object.wait(Native Method)
>>> - waiting on <0x0000000138a84470> (a java.lang.ref.Reference$Lock)
>>> at java.lang.Object.wait(Object.java:503)
>>> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>>> - locked <0x0000000138a84470> (a java.lang.ref.Reference$Lock)
>>> 
>>> "main" prio=5 tid=0x00007faeb9009800 nid=0xd03 runnable
>>> [0x000000010f1c0000]
>>>  java.lang.Thread.State: RUNNABLE
>>> at java.net.PlainSocketImpl.socketAccept(Native Method)
>>> at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>>> at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>>> at java.net.ServerSocket.accept(ServerSocket.java:498)
>>> at org.apache.flink.streaming.api.functions.sink.SocketClientSinkTest.testSocketSinkRetryAccess(SocketClientSinkTest.java:315)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>>> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>>> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>>> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>>> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>>> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>>> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>>> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>>> at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>>> at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>>> at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>>> at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>>> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>>> 
>>> "VM Thread" prio=5 tid=0x00007faebb82e800 nid=0x2f03 runnable
>>> 
>>> "GC task thread#0 (ParallelGC)" prio=5 tid=0x00007faeb9806800 nid=0x1e03
>>> runnable
>>> 
>>> "GC task thread#1 (ParallelGC)" prio=5 tid=0x00007faebb000000 nid=0x2103
>>> runnable
>>> 
>>> "GC task thread#2 (ParallelGC)" prio=5 tid=0x00007faebb001000 nid=0x2303
>>> runnable
>>> 
>>> "GC task thread#3 (ParallelGC)" prio=5 tid=0x00007faebb001800 nid=0x2503
>>> runnable
>>> 
>>> "GC task thread#4 (ParallelGC)" prio=5 tid=0x00007faebb002000 nid=0x2703
>>> runnable
>>> 
>>> "GC task thread#5 (ParallelGC)" prio=5 tid=0x00007faebb002800 nid=0x2903
>>> runnable
>>> 
>>> "GC task thread#6 (ParallelGC)" prio=5 tid=0x00007faebb003800 nid=0x2b03
>>> runnable
>>> 
>>> "GC task thread#7 (ParallelGC)" prio=5 tid=0x00007faeb9809000 nid=0x2d03
>>> runnable
>>> 
>>> "VM Periodic Task Thread" prio=5 tid=0x00007faeb980e000 nid=0x4f03 waiting
>>> on condition
>>> 
>>> JNI global references: 195
>>> 
>>> 
>>> 
>>> 
>>> On 23 September 2015 at 13:35, Stephan Ewen <sewen@apache.org> wrote:
>>> 
>>>> I have pushed it, yes. If you rebase onto the latest master, it should
>>>> work.
>>>> 
>>>> If you can verify that it still hangs, can you post a stack trace dump?
>>>> 
>>>> Thanks,
>>>> Stephan
>>>> 
>>>> 
>>>> On Wed, Sep 23, 2015 at 12:37 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
>>>> 
>>>>> @Stephan, have you pushed that fix for SocketClientSinkTest? Local builds
>>>>> still hang for me :S
>>>>> 
>>>>> On 21 September 2015 at 22:55, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
>>>>> 
>>>>>> Yes, you're right. BarrierBufferMassiveRandomTest has actually finished :-)
>>>>>> Sorry for the confusion! I'll wait for your fix then, thanks!
>>>>>> 
>>>>>> On 21 September 2015 at 22:51, Stephan Ewen <sewen@apache.org> wrote:
>>>>>> 
>>>>>>> I am actually very happy that it is not the
>>>>>>> "BarrierBufferMassiveRandomTest", that would be hell to debug...
>>>>>>> 
>>>>>>> On Mon, Sep 21, 2015 at 10:51 PM, Stephan Ewen <sewen@apache.org> wrote:
>>>>>>> 
>>>>>>>> Ah, actually it is a different test. I think you got confused by the
>>>>>>>> sysout log, because multiple parallel tests print there (that makes it
>>>>>>>> not always obvious which one hangs).
>>>>>>>> 
>>>>>>>> The test is the "SocketClientSinkTest.testSocketSinkRetryAccess()" test.
>>>>>>>> You can see that by looking at which test case the "main" thread is
>>>>>>>> stuck in.
>>>>>>>> 
>>>>>>>> This test is very unstable, but, fortunately, I made a fix 1h ago and it
>>>>>>>> is being tested on Travis right now :-)
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> Stephan
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Sep 21, 2015 at 10:23 PM, Vasiliki Kalavri <
>>>>>>>> vasilikikalavri@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> Locally yes.
>>>>>>>>> 
>>>>>>>>> Here's the stack trace:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 2015-09-21 22:22:46
>>>>>>>>> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode):
>>>>>>>>> 
>>>>>>>>> "Attach Listener" daemon prio=5 tid=0x00007ff9d104e800 nid=0x4013 waiting
>>>>>>>>> on condition [0x0000000000000000]
>>>>>>>>>  java.lang.Thread.State: RUNNABLE
>>>>>>>>> 
>>>>>>>>> "Service Thread" daemon prio=5 tid=0x00007ff9d3807000 nid=0x4c03 runnable
>>>>>>>>> [0x0000000000000000]
>>>>>>>>>  java.lang.Thread.State: RUNNABLE
>>>>>>>>> 
>>>>>>>>> "C2 CompilerThread1" daemon prio=5 tid=0x00007ff9d2001000 nid=0x4a03
>>>>>>>>> waiting on condition [0x0000000000000000]
>>>>>>>>>  java.lang.Thread.State: RUNNABLE
>>>>>>>>> 
>>>>>>>>> "C2 CompilerThread0" daemon prio=5 tid=0x00007ff9d201e000 nid=0x4803
>>>>>>>>> waiting on condition [0x0000000000000000]
>>>>>>>>>  java.lang.Thread.State: RUNNABLE
>>>>>>>>> 
>>>>>>>>> "Signal Dispatcher" daemon prio=5 tid=0x00007ff9d3012800 nid=0x451b
>>>>>>>>> runnable [0x0000000000000000]
>>>>>>>>>  java.lang.Thread.State: RUNNABLE
>>>>>>>>> 
>>>>>>>>> "Finalizer" daemon prio=5 tid=0x00007ff9d4005800 nid=0x3303 in
>>>>>>>>> Object.wait() [0x000000011430d000]
>>>>>>>>>  java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>>> at java.lang.Object.wait(Native Method)
>>>>>>>>> - waiting on <0x00000007ef504858> (a java.lang.ref.ReferenceQueue$Lock)
>>>>>>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>>>>>>>>> - locked <0x00000007ef504858> (a java.lang.ref.ReferenceQueue$Lock)
>>>>>>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>>>>>>>>> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
>>>>>>>>> 
>>>>>>>>> "Reference Handler" daemon prio=5 tid=0x00007ff9d480b000 nid=0x3103 in
>>>>>>>>> Object.wait() [0x000000011420a000]
>>>>>>>>>  java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>>> at java.lang.Object.wait(Native Method)
>>>>>>>>> - waiting on <0x00000007ef504470> (a java.lang.ref.Reference$Lock)
>>>>>>>>> at java.lang.Object.wait(Object.java:503)
>>>>>>>>> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>>>>>>>>> - locked <0x00000007ef504470> (a java.lang.ref.Reference$Lock)
>>>>>>>>> 
>>>>>>>>> "main" prio=5 tid=0x00007ff9d4800000 nid=0xd03 runnable
>>>>>>>>> [0x000000010b764000]
>>>>>>>>>  java.lang.Thread.State: RUNNABLE
>>>>>>>>> at java.net.PlainSocketImpl.socketAccept(Native Method)
>>>>>>>>> at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>>>>>>>>> at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>>>>>>>>> at java.net.ServerSocket.accept(ServerSocket.java:498)
>>>>>>>>> at org.apache.flink.streaming.api.functions.sink.SocketClientSinkTest.testSocketSinkRetryAccess(SocketClientSinkTest.java:315)
>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>>>>>>>>> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>>>>>>>> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>>>>>>>>> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>>>>>>>> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>>>>>>>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>>>>>>>> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>>>>>>>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>>>>>>>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>>>>>>>> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>>>>>>>>> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>>>>>>>>> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>>>>>>>>> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>>>>>>>>> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>>>>>>>>> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>>>>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>>>>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>>>>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>>>>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>>>>>>>>> at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>>>>>>>>> at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>>>>>>>>> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>>>>>>>>> 
>>>>>>>>> "VM Thread" prio=5 tid=0x00007ff9d4005000 nid=0x2f03 runnable
>>>>>>>>> 
>>>>>>>>> "GC task thread#0 (ParallelGC)" prio=5 tid=0x00007ff9d2005800 nid=0x1f03
>>>>>>>>> runnable
>>>>>>>>> 
>>>>>>>>> "GC task thread#1 (ParallelGC)" prio=5 tid=0x00007ff9d1800000 nid=0x2103
>>>>>>>>> runnable
>>>>>>>>> 
>>>>>>>>> "GC task thread#2 (ParallelGC)" prio=5 tid=0x00007ff9d1804800 nid=0x2303
>>>>>>>>> runnable
>>>>>>>>> 
>>>>>>>>> "GC task thread#3 (ParallelGC)" prio=5 tid=0x00007ff9d1805000 nid=0x2503
>>>>>>>>> runnable
>>>>>>>>> 
>>>>>>>>> "GC task thread#4 (ParallelGC)" prio=5 tid=0x00007ff9d1805800 nid=0x2703
>>>>>>>>> runnable
>>>>>>>>> 
>>>>>>>>> "GC task thread#5 (ParallelGC)" prio=5 tid=0x00007ff9d1806800 nid=0x2903
>>>>>>>>> runnable
>>>>>>>>> 
>>>>>>>>> "GC task thread#6 (ParallelGC)" prio=5 tid=0x00007ff9d1807000 nid=0x2b03
>>>>>>>>> runnable
>>>>>>>>> 
>>>>>>>>> "GC task thread#7 (ParallelGC)" prio=5 tid=0x00007ff9d1807800 nid=0x2d03
>>>>>>>>> runnable
>>>>>>>>> 
>>>>>>>>> "VM Periodic Task Thread" prio=5 tid=0x00007ff9d1006000 nid=0x4e03 waiting
>>>>>>>>> on condition
>>>>>>>>> 
>>>>>>>>> JNI global references: 193
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 21 September 2015 at 22:13, Stephan Ewen <sewen@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>>> This happened locally on your machine?
>>>>>>>>>> 
>>>>>>>>>> Can you dump the stack-trace and post it? "jstack <processid> >
>>>>>>>>>> stacktrace.txt" or so...
>>>>>>>>>> 
>>>>>>>>>> On Mon, Sep 21, 2015 at 10:09 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi squirrels,
>>>>>>>>>>> 
>>>>>>>>>>> I've been meaning to merge a PR (#1520), but my local maven build gets
>>>>>>>>>>> stuck at
>>>>>>>>>>> org.apache.flink.streaming.runtime.io.BarrierBufferMassiveRandomTest.
>>>>>>>>>>> It looks like a deadlock. The build just hangs there and top shows no
>>>>>>>>>>> CPU/memory load. Has anyone else experienced the same? I'm on OS X 10.10.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks!
>>>>>>>>>>> -Vasia.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> 
>> 
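
In both thread dumps quoted above, the "main" thread sits in ServerSocket.accept(), called from SocketClientSinkTest.testSocketSinkRetryAccess(). A plain accept() blocks until a client connects, so a test can wait there indefinitely. The sketch below shows one common way to bound that wait with setSoTimeout(). It is only an illustration, not the fix that was pushed to master; the class name AcceptTimeoutSketch, the method acceptWithTimeout and the 200 ms value are made up for this example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class AcceptTimeoutSketch {

    /**
     * Waits for one client connection, but gives up after timeoutMillis.
     * Returns true if a client connected, false if the wait timed out.
     * (Hypothetical helper, not part of Flink.)
     */
    public static boolean acceptWithTimeout(ServerSocket server, int timeoutMillis)
            throws IOException {
        // Without a timeout, accept() blocks until a client connects.
        server.setSoTimeout(timeoutMillis);
        try (Socket client = server.accept()) {
            return true;
        } catch (SocketTimeoutException e) {
            // Nobody connected in time; report it instead of hanging.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free port; nobody connects in this demo.
        try (ServerSocket server =
                new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            boolean connected = acceptWithTimeout(server, 200);
            System.out.println("client connected: " + connected);
        }
    }
}
```

With a bounded wait like this, a test whose client never reconnects would fail after the timeout rather than stall the whole build.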


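
As noted in the thread, the hanging test is found by looking at which test case the "main" thread is stuck in, not by reading the interleaved sysout log. A minimal sketch of that lookup follows, assuming test classes follow the "...Test" naming convention; StuckTestFinder and firstTestFrame are hypothetical names, and the frames are a shortened copy of the dump above.

```java
import java.util.Optional;

public class StuckTestFinder {

    /**
     * Returns the first (innermost) frame whose class name ends with "Test".
     * The "Test" suffix is a naming-convention heuristic assumed for this sketch.
     */
    public static Optional<StackTraceElement> firstTestFrame(StackTraceElement[] stack) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().endsWith("Test")) {
                return Optional.of(frame);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A few frames from the "main" thread in the quoted dumps, innermost first.
        StackTraceElement[] mainThread = {
            new StackTraceElement("java.net.ServerSocket", "accept",
                "ServerSocket.java", 498),
            new StackTraceElement(
                "org.apache.flink.streaming.api.functions.sink.SocketClientSinkTest",
                "testSocketSinkRetryAccess", "SocketClientSinkTest.java", 315),
            new StackTraceElement("org.junit.runners.ParentRunner", "run",
                "ParentRunner.java", 309),
        };
        firstTestFrame(mainThread).ifPresent(f -> System.out.println(
            "main is stuck in " + f.getClassName() + "." + f.getMethodName() + "()"));
    }
}
```

Inside a running JVM the same helper could be fed from Thread.getAllStackTraces(); for a separate surefire fork, a dump taken with jstack remains the practical route.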