flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Build get stuck at BarrierBufferMassiveRandomTest
Date Wed, 23 Sep 2015 14:09:24 GMT
Okay, I will look into this a bit today...

On Wed, Sep 23, 2015 at 4:04 PM, Ufuk Celebi <uce@apache.org> wrote:

> Same here.
>
> > On 23 Sep 2015, at 13:50, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
> >
> > Hi,
> >
> > It's the latest master I'm trying to build, but it still hangs.
> > Here's the trace:
> >
> > -----------------------------
> > 2015-09-23 13:48:41
> > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode):
> >
> > "Attach Listener" daemon prio=5 tid=0x00007faeb984a000 nid=0x3707 waiting
> > on condition [0x0000000000000000]
> >   java.lang.Thread.State: RUNNABLE
> >
> > "Service Thread" daemon prio=5 tid=0x00007faeb9808000 nid=0x4d03 runnable
> > [0x0000000000000000]
> >   java.lang.Thread.State: RUNNABLE
> >
> > "C2 CompilerThread1" daemon prio=5 tid=0x00007faebb00e800 nid=0x4b03
> > waiting on condition [0x0000000000000000]
> >   java.lang.Thread.State: RUNNABLE
> >
> > "C2 CompilerThread0" daemon prio=5 tid=0x00007faebb840800 nid=0x4903
> > waiting on condition [0x0000000000000000]
> >   java.lang.Thread.State: RUNNABLE
> >
> > "Signal Dispatcher" daemon prio=5 tid=0x00007faeba806800 nid=0x3d0f
> > runnable [0x0000000000000000]
> >   java.lang.Thread.State: RUNNABLE
> >
> > "Finalizer" daemon prio=5 tid=0x00007faebb836800 nid=0x3303 in
> > Object.wait() [0x000000014eff8000]
> >   java.lang.Thread.State: WAITING (on object monitor)
> > at java.lang.Object.wait(Native Method)
> > - waiting on <0x0000000138a84858> (a java.lang.ref.ReferenceQueue$Lock)
> > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> > - locked <0x0000000138a84858> (a java.lang.ref.ReferenceQueue$Lock)
> > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> > at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
> >
> > "Reference Handler" daemon prio=5 tid=0x00007faebb004000 nid=0x3103 in
> > Object.wait() [0x000000014eef5000]
> >   java.lang.Thread.State: WAITING (on object monitor)
> > at java.lang.Object.wait(Native Method)
> > - waiting on <0x0000000138a84470> (a java.lang.ref.Reference$Lock)
> > at java.lang.Object.wait(Object.java:503)
> > at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
> > - locked <0x0000000138a84470> (a java.lang.ref.Reference$Lock)
> >
> > "main" prio=5 tid=0x00007faeb9009800 nid=0xd03 runnable [0x000000010f1c0000]
> >   java.lang.Thread.State: RUNNABLE
> > at java.net.PlainSocketImpl.socketAccept(Native Method)
> > at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
> > at java.net.ServerSocket.implAccept(ServerSocket.java:530)
> > at java.net.ServerSocket.accept(ServerSocket.java:498)
> > at org.apache.flink.streaming.api.functions.sink.SocketClientSinkTest.testSocketSinkRetryAccess(SocketClientSinkTest.java:315)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> > at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> > at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> > at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> > at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> > at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> > at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> > at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
> > at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
> > at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
> > at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
> > at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
> > at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
> > at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> >
> > "VM Thread" prio=5 tid=0x00007faebb82e800 nid=0x2f03 runnable
> >
> > "GC task thread#0 (ParallelGC)" prio=5 tid=0x00007faeb9806800 nid=0x1e03
> > runnable
> >
> > "GC task thread#1 (ParallelGC)" prio=5 tid=0x00007faebb000000 nid=0x2103
> > runnable
> >
> > "GC task thread#2 (ParallelGC)" prio=5 tid=0x00007faebb001000 nid=0x2303
> > runnable
> >
> > "GC task thread#3 (ParallelGC)" prio=5 tid=0x00007faebb001800 nid=0x2503
> > runnable
> >
> > "GC task thread#4 (ParallelGC)" prio=5 tid=0x00007faebb002000 nid=0x2703
> > runnable
> >
> > "GC task thread#5 (ParallelGC)" prio=5 tid=0x00007faebb002800 nid=0x2903
> > runnable
> >
> > "GC task thread#6 (ParallelGC)" prio=5 tid=0x00007faebb003800 nid=0x2b03
> > runnable
> >
> > "GC task thread#7 (ParallelGC)" prio=5 tid=0x00007faeb9809000 nid=0x2d03
> > runnable
> >
> > "VM Periodic Task Thread" prio=5 tid=0x00007faeb980e000 nid=0x4f03 waiting
> > on condition
> >
> > JNI global references: 195
> >
> >
> >
> >
> > On 23 September 2015 at 13:35, Stephan Ewen <sewen@apache.org> wrote:
> >
> >> I have pushed it, yes. If you rebase onto the latest master, it should
> >> work.
> >>
> >> If you can verify that it still hangs, can you post a stack trace dump?
> >>
> >> Thanks,
> >> Stephan
> >>
> >>
> >> On Wed, Sep 23, 2015 at 12:37 PM, Vasiliki Kalavri <
> >> vasilikikalavri@gmail.com> wrote:
> >>
> >>> @Stephan, have you pushed that fix for SocketClientSinkTest? Local builds
> >>> still hang for me :S
> >>>
> >>> On 21 September 2015 at 22:55, Vasiliki Kalavri <vasilikikalavri@gmail.com>
> >>> wrote:
> >>>
> >>>> Yes, you're right. BarrierBufferMassiveRandomTest has actually finished
> >>>> :-)
> >>>> Sorry for the confusion! I'll wait for your fix then, thanks!
> >>>>
> >>>> On 21 September 2015 at 22:51, Stephan Ewen <sewen@apache.org> wrote:
> >>>>
> >>>>> I am actually very happy that it is not the
> >>>>> "BarrierBufferMassiveRandomTest", that would be hell to debug...
> >>>>>
> >>>>> On Mon, Sep 21, 2015 at 10:51 PM, Stephan Ewen <sewen@apache.org> wrote:
> >>>>>
> >>>>>> Ah, actually it is a different test. I think you got confused by the
> >>>>>> sysout log, because multiple parallel tests print there (that makes it
> >>>>>> not always obvious which one hangs).
> >>>>>>
> >>>>>> The test is the "SocketClientSinkTest.testSocketSinkRetryAccess()" test.
> >>>>>> You can see that by looking at which test case the "main" thread is
> >>>>>> stuck in.
> >>>>>>
> >>>>>> This test is very unstable, but, fortunately, I made a fix 1h ago and it
> >>>>>> is being tested on Travis right now :-)
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Stephan
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Sep 21, 2015 at 10:23 PM, Vasiliki Kalavri <
> >>>>>> vasilikikalavri@gmail.com> wrote:
> >>>>>>
> >>>>>>> Locally yes.
> >>>>>>>
> >>>>>>> Here's the stack trace:
> >>>>>>>
> >>>>>>>
> >>>>>>> 2015-09-21 22:22:46
> >>>>>>> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode):
> >>>>>>>
> >>>>>>> "Attach Listener" daemon prio=5 tid=0x00007ff9d104e800 nid=0x4013 waiting
> >>>>>>> on condition [0x0000000000000000]
> >>>>>>>   java.lang.Thread.State: RUNNABLE
> >>>>>>>
> >>>>>>> "Service Thread" daemon prio=5 tid=0x00007ff9d3807000 nid=0x4c03 runnable
> >>>>>>> [0x0000000000000000]
> >>>>>>>   java.lang.Thread.State: RUNNABLE
> >>>>>>>
> >>>>>>> "C2 CompilerThread1" daemon prio=5 tid=0x00007ff9d2001000 nid=0x4a03
> >>>>>>> waiting on condition [0x0000000000000000]
> >>>>>>>   java.lang.Thread.State: RUNNABLE
> >>>>>>>
> >>>>>>> "C2 CompilerThread0" daemon prio=5 tid=0x00007ff9d201e000 nid=0x4803
> >>>>>>> waiting on condition [0x0000000000000000]
> >>>>>>>   java.lang.Thread.State: RUNNABLE
> >>>>>>>
> >>>>>>> "Signal Dispatcher" daemon prio=5 tid=0x00007ff9d3012800 nid=0x451b
> >>>>>>> runnable [0x0000000000000000]
> >>>>>>>   java.lang.Thread.State: RUNNABLE
> >>>>>>>
> >>>>>>> "Finalizer" daemon prio=5 tid=0x00007ff9d4005800 nid=0x3303 in
> >>>>>>> Object.wait() [0x000000011430d000]
> >>>>>>>   java.lang.Thread.State: WAITING (on object monitor)
> >>>>>>> at java.lang.Object.wait(Native Method)
> >>>>>>> - waiting on <0x00000007ef504858> (a java.lang.ref.ReferenceQueue$Lock)
> >>>>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> >>>>>>> - locked <0x00000007ef504858> (a java.lang.ref.ReferenceQueue$Lock)
> >>>>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> >>>>>>> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
> >>>>>>>
> >>>>>>> "Reference Handler" daemon prio=5 tid=0x00007ff9d480b000 nid=0x3103 in
> >>>>>>> Object.wait() [0x000000011420a000]
> >>>>>>>   java.lang.Thread.State: WAITING (on object monitor)
> >>>>>>> at java.lang.Object.wait(Native Method)
> >>>>>>> - waiting on <0x00000007ef504470> (a java.lang.ref.Reference$Lock)
> >>>>>>> at java.lang.Object.wait(Object.java:503)
> >>>>>>> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
> >>>>>>> - locked <0x00000007ef504470> (a java.lang.ref.Reference$Lock)
> >>>>>>>
> >>>>>>> "main" prio=5 tid=0x00007ff9d4800000 nid=0xd03 runnable [0x000000010b764000]
> >>>>>>>   java.lang.Thread.State: RUNNABLE
> >>>>>>> at java.net.PlainSocketImpl.socketAccept(Native Method)
> >>>>>>> at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
> >>>>>>> at java.net.ServerSocket.implAccept(ServerSocket.java:530)
> >>>>>>> at java.net.ServerSocket.accept(ServerSocket.java:498)
> >>>>>>> at org.apache.flink.streaming.api.functions.sink.SocketClientSinkTest.testSocketSinkRetryAccess(SocketClientSinkTest.java:315)
> >>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
> >>>>>>> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> >>>>>>> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >>>>>>> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> >>>>>>> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >>>>>>> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> >>>>>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> >>>>>>> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> >>>>>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> >>>>>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> >>>>>>> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> >>>>>>> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> >>>>>>> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> >>>>>>> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> >>>>>>> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> >>>>>>> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> >>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
> >>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
> >>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
> >>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
> >>>>>>> at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
> >>>>>>> at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
> >>>>>>> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> >>>>>>>
> >>>>>>> "VM Thread" prio=5 tid=0x00007ff9d4005000 nid=0x2f03 runnable
> >>>>>>>
> >>>>>>> "GC task thread#0 (ParallelGC)" prio=5 tid=0x00007ff9d2005800 nid=0x1f03
> >>>>>>> runnable
> >>>>>>>
> >>>>>>> "GC task thread#1 (ParallelGC)" prio=5 tid=0x00007ff9d1800000 nid=0x2103
> >>>>>>> runnable
> >>>>>>>
> >>>>>>> "GC task thread#2 (ParallelGC)" prio=5 tid=0x00007ff9d1804800 nid=0x2303
> >>>>>>> runnable
> >>>>>>>
> >>>>>>> "GC task thread#3 (ParallelGC)" prio=5 tid=0x00007ff9d1805000 nid=0x2503
> >>>>>>> runnable
> >>>>>>>
> >>>>>>> "GC task thread#4 (ParallelGC)" prio=5 tid=0x00007ff9d1805800 nid=0x2703
> >>>>>>> runnable
> >>>>>>>
> >>>>>>> "GC task thread#5 (ParallelGC)" prio=5 tid=0x00007ff9d1806800 nid=0x2903
> >>>>>>> runnable
> >>>>>>>
> >>>>>>> "GC task thread#6 (ParallelGC)" prio=5 tid=0x00007ff9d1807000 nid=0x2b03
> >>>>>>> runnable
> >>>>>>>
> >>>>>>> "GC task thread#7 (ParallelGC)" prio=5 tid=0x00007ff9d1807800 nid=0x2d03
> >>>>>>> runnable
> >>>>>>>
> >>>>>>> "VM Periodic Task Thread" prio=5 tid=0x00007ff9d1006000 nid=0x4e03 waiting
> >>>>>>> on condition
> >>>>>>>
> >>>>>>> JNI global references: 193
> >>>>>>>
> >>>>>>>
> >>>>>>> On 21 September 2015 at 22:13, Stephan Ewen <sewen@apache.org> wrote:
> >>>>>>>
> >>>>>>>> This happened locally on your machine?
> >>>>>>>>
> >>>>>>>> Can you dump the stack-trace and post it? "jstack <processid> >
> >>>>>>>> stacktrace.txt" or so...
> >>>>>>>>
> >>>>>>>> On Mon, Sep 21, 2015 at 10:09 PM, Vasiliki Kalavri <
> >>>>>>>> vasilikikalavri@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Hi squirrels,
> >>>>>>>>>
> >>>>>>>>> I've been meaning to merge a PR (#1520), but my local Maven build gets
> >>>>>>>>> stuck at
> >>>>>>>>> org.apache.flink.streaming.runtime.io.BarrierBufferMassiveRandomTest.
> >>>>>>>>> It looks like a deadlock. The build just hangs there and top shows no
> >>>>>>>>> CPU/memory load. Has anyone else experienced the same? I'm on OS X
> >>>>>>>>> 10.10.
> >>>>>>>>>
> >>>>>>>>> Thanks!
> >>>>>>>>> -Vasia.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
>
>
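
A note on the traces quoted above: in both dumps the "main" thread sits in
ServerSocket.accept(), which by default blocks until a client connects. The
sketch below is not the fix that was pushed to master (the thread does not show
it). It only illustrates, under that assumption, how a test could bound the wait
with ServerSocket.setSoTimeout so that a missing client produces a failure
instead of a hung build. The names BoundedAccept and acceptWithTimeout are made
up for this example.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class BoundedAccept {

    /**
     * Waits for one client connection, but gives up after timeoutMillis
     * instead of blocking forever. Returns true if a client connected.
     * (Hypothetical helper, not part of Flink.)
     */
    static boolean acceptWithTimeout(ServerSocket server, int timeoutMillis) throws IOException {
        // With a non-zero SO_TIMEOUT, accept() throws SocketTimeoutException on expiry.
        server.setSoTimeout(timeoutMillis);
        try (Socket client = server.accept()) {
            return true;
        } catch (SocketTimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free port; nobody connects in this demo.
        try (ServerSocket server = new ServerSocket(0)) {
            boolean connected = acceptWithTimeout(server, 200);
            System.out.println("client connected: " + connected);
        }
    }
}
```

A test using such a helper would fail with an assertion when the result is
false, and that shows up in the surefire report rather than as a stuck "main"
thread.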
