flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Current master broken?
Date Sun, 15 Mar 2015 16:07:01 GMT
Waiting for travis to give me the green light, then I'll push the fix...

On Sun, Mar 15, 2015 at 5:04 PM, Robert Metzger <rmetzger@apache.org> wrote:

> I think the issue is that our tests are executed on travis machines with
> different physical CPU core counts.
>
> I've pushed a five-day-old commit
> (https://github.com/rmetzger/flink/commit/b4e8350f52c81704ffc726a1689bb0dc7180776d)
> to travis, and it also failed with that issue:
> https://travis-ci.org/rmetzger/flink/builds/54443951
>
> Thanks for resolving the issue so quickly Stephan!
>
> On Sun, Mar 15, 2015 at 4:06 PM, Andra Lungu <lungu.andra@gmail.com> wrote:
>
> > Hi Stephan,
> >
> > The degree of parallelism was manually set there. MultipleProgramsTestBase
> > cannot be extended; Ufuk explained why.
> >
> > But I see that for the latest travis check, that test passed.
> > https://github.com/apache/flink/pull/475
> >
> > On Sun, Mar 15, 2015 at 3:54 PM, Stephan Ewen <sewen@apache.org> wrote:
> >
> > > Cause of the failures:
> > >
> > > The tests in DegreesWithExceptionITCase use the context execution
> > > environment without extending a test base. This context environment
> > > instantiates a local execution environment with a parallelism equal to the
> > > number of cores. Since builds on travis run in containers on big machines,
> > > the number of cores may be very high (32 or 64). With the default
> > > configuration, this causes the tests to run out of network buffers.
> > >
> > >
> > > IMPORTANT: Please make sure that all future tests either use one of the
> > > test base classes (which define a reasonable parallelism) or define the
> > > parallelism manually to be safe!
> > >
> > > On Sun, Mar 15, 2015 at 3:43 PM, Stephan Ewen <sewen@apache.org> wrote:
> > >
> > > > It seems that the current master is broken with respect to the tests.
> > > >
> > > > I see all builds on Travis consistently failing in the gelly project.
> > > > Since Travis is a bit behind in the "apache" account, I triggered a build
> > > > in my own account. The hash is the same, so it should contain the master
> > > > from yesterday.
> > > >
> > > > https://travis-ci.org/StephanEwen/incubator-flink/builds/54386416
> > > >
> > > > In all executions it results in the stack trace below. Unfortunately, I
> > > > cannot reproduce the problem locally.
> > > >
> > > > This is a serious issue; it totally kills the testability.
> > > >
> > > > Results :
> > > >
> > > > Failed tests:
> > > >   DegreesWithExceptionITCase.testGetDegreesInvalidEdgeSrcId:113
> > > > expected:<[The edge src/trg id could not be found within the vertexIds]>
> > > > but was:<[Failed to deploy the task Reduce(SUM(1), at
> > > > getDegrees(Graph.java:664) (30/32) - execution #0 to slot SimpleSlot (2)(2)
> > > > - 31624115d75feb2c387ae9043021d8e6 - ALLOCATED/ALIVE: java.io.IOException:
> > > > Insufficient number of network buffers: required 32, but only 2 available.
> > > > The total number of network buffers is currently set to 2048. You can
> > > > increase this number by setting the configuration key
> > > > 'taskmanager.network.numberOfBuffers'.
> > > >       at org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.createBufferPool(NetworkBufferPool.java:158)
> > > >       at org.apache.flink.runtime.io.network.NetworkEnvironment.registerTask(NetworkEnvironment.java:163)
> > > >       at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:454)
> > > >       at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:237)
> > > >       at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> > > >       at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> > > >       at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> > > >       at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
> > > >       at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
> > > >       at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> > > >       at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
> > > >       at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> > > >       at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:91)
> > > >       at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> > > >       at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> > > >       at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> > > >       at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> > > >       at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> > > >       at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > >       at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
> > > >       at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
> > > >       at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > >       at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > > ]>
> > > >   DegreesWithExceptionITCase.testGetDegreesInvalidEdgeTrgId:92
> > > > expected:<[The edge src/trg id could not be found within the vertexIds]>
> > > > but was:<[Failed to deploy the task CoGroup (CoGroup at
> > > > inDegrees(Graph.java:655)) (29/32) - execution #0 to slot SimpleSlot (1)(3)
> > > > - 1735ca6f2fb76f9f0a0ab03ffd9c9f93 - ALLOCATED/ALIVE: java.io.IOException:
> > > > Insufficient number of network buffers: required 32, but only 8 available.
> > > > The total number of network buffers is currently set to 2048. You can
> > > > increase this number by setting the configuration key
> > > > 'taskmanager.network.numberOfBuffers'.
> > > >       at org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.createBufferPool(NetworkBufferPool.java:158)
> > > >       at org.apache.flink.runtime.io.network.NetworkEnvironment.registerTask(NetworkEnvironment.java:135)
> > > >       at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:454)
> > > >       at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:237)
> > > >       at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> > > >       at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> > > >       at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> > > >       at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
> > > >       at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
> > > >       at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> > > >       at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
> > > >       at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> > > >       at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:91)
> > > >       at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> > > >       at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> > > >       at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> > > >       at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> > > >       at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> > > >       at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > >       at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
> > > >       at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
> > > >       at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > >       at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > > ]>
> > > >   DegreesWithExceptionITCase.testGetDegreesInvalidEdgeSrcTrgId:134
> > > > expected:<[The edge src/trg id could not be found within the vertexIds]>
> > > > but was:<[Failed to deploy the task CoGroup (CoGroup at
> > > > inDegrees(Graph.java:655)) (31/32) - execution #0 to slot SimpleSlot (1)(3)
> > > > - 3a465bdbeca9625e5d90572ed0959b1d - ALLOCATED/ALIVE: java.io.IOException:
> > > > Insufficient number of network buffers: required 32, but only 8 available.
> > > > The total number of network buffers is currently set to 2048. You can
> > > > increase this number by setting the configuration key
> > > > 'taskmanager.network.numberOfBuffers'.
> > > >       at org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.createBufferPool(NetworkBufferPool.java:158)
> > > >       at org.apache.flink.runtime.io.network.NetworkEnvironment.registerTask(NetworkEnvironment.java:135)
> > > >       at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:454)
> > > >       at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:237)
> > > >       at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> > > >       at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> > > >       at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> > > >       at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
> > > >       at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
> > > >       at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> > > >       at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
> > > >       at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> > > >       at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:91)
> > > >       at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> > > >       at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> > > >       at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> > > >       at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> > > >       at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> > > >       at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > >       at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
> > > >       at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
> > > >       at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > >       at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > > ]>
> > > >
> > > > Tests run: 180, Failures: 3, Errors: 0, Skipped: 0
> > > >
> > > > [INFO]
> > > > [INFO] --- maven-failsafe-plugin:2.17:verify (default) @ flink-gelly ---
> > > > [INFO] Failsafe report directory: /home/travis/build/StephanEwen/incubator-flink/flink-staging/flink-gelly/target/failsafe-reports
> > > > [INFO] ------------------------------------------------------------------------
> > > > [INFO] Reactor Summary:
> > > > [INFO]
> > > > [INFO] flink .............................................. SUCCESS [  6.075 s]
> > > > [INFO] flink-shaded-hadoop ................................ SUCCESS [  1.827 s]
> > > > [INFO] flink-shaded-hadoop1 ............................... SUCCESS [  7.384 s]
> > > > [INFO] flink-core ......................................... SUCCESS [ 37.973 s]
> > > > [INFO] flink-java ......................................... SUCCESS [ 17.373 s]
> > > > [INFO] flink-runtime ...................................... SUCCESS [11:13 min]
> > > > [INFO] flink-compiler ..................................... SUCCESS [  7.149 s]
> > > > [INFO] flink-clients ...................................... SUCCESS [  9.130 s]
> > > > [INFO] flink-test-utils ................................... SUCCESS [  8.519 s]
> > > > [INFO] flink-scala ........................................ SUCCESS [ 36.171 s]
> > > > [INFO] flink-examples ..................................... SUCCESS [  0.370 s]
> > > > [INFO] flink-java-examples ................................ SUCCESS [  2.335 s]
> > > > [INFO] flink-scala-examples ............................... SUCCESS [ 25.139 s]
> > > > [INFO] flink-staging ...................................... SUCCESS [  0.093 s]
> > > > [INFO] flink-streaming .................................... SUCCESS [  0.315 s]
> > > > [INFO] flink-streaming-core ............................... SUCCESS [  9.560 s]
> > > > [INFO] flink-tests ........................................ SUCCESS [09:11 min]
> > > > [INFO] flink-avro ......................................... SUCCESS [ 17.307 s]
> > > > [INFO] flink-jdbc ......................................... SUCCESS [  3.715 s]
> > > > [INFO] flink-spargel ...................................... SUCCESS [  7.141 s]
> > > > [INFO] flink-hadoop-compatibility ......................... SUCCESS [ 19.508 s]
> > > > [INFO] flink-streaming-scala .............................. SUCCESS [ 14.936 s]
> > > > [INFO] flink-streaming-connectors ......................... SUCCESS [  2.784 s]
> > > > [INFO] flink-streaming-examples ........................... SUCCESS [ 18.787 s]
> > > > [INFO] flink-hbase ........................................ SUCCESS [  2.870 s]
> > > > [INFO] flink-gelly ........................................ FAILURE [ 58.548 s]
> > > > [INFO] flink-hcatalog ..................................... SKIPPED
> > > > [INFO] flink-expressions .................................. SKIPPED
> > > > [INFO] flink-quickstart ................................... SKIPPED
> > > > [INFO] flink-quickstart-java .............................. SKIPPED
> > > > [INFO] flink-quickstart-scala ............................. SKIPPED
> > > > [INFO] flink-contrib ...................................... SKIPPED
> > > > [INFO] flink-dist ......................................... SKIPPED
> > > > [INFO] ------------------------------------------------------------------------
> > > > [INFO] BUILD FAILURE
> > > > [INFO] ------------------------------------------------------------------------
> > > >
> > > >
> > >
> >
>
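
The failure mode described in this thread (parallelism defaulting to the
number of cores, which exhausts the network buffer pool) can be illustrated
with a rough back-of-the-envelope model. The sketch below is not Flink code
and does not call Flink's API. It assumes, as a simplification that matches
the "required 32" in the error message at parallelism 32, that each parallel
subtask requests one buffer per channel of an all-to-all exchange. The class
name NetworkBufferEstimate, its method names, the count of four exchange
endpoints, and the cap of 4 are all hypothetical and chosen only for
illustration.

```java
// Simplified model for illustration only; this is NOT Flink's actual
// buffer accounting. All names and constants here are hypothetical.
final class NetworkBufferEstimate {

    /** Assumption: one buffer per parallel channel of an all-to-all exchange. */
    static int buffersPerSubtask(int parallelism) {
        return parallelism;
    }

    /**
     * Buffers requested by all subtasks of a job, given how many exchange
     * endpoints (sending or receiving sides of a shuffle) the job has.
     * The demand grows quadratically with the parallelism.
     */
    static long totalBuffers(int parallelism, int exchangeEndpoints) {
        return (long) parallelism * buffersPerSubtask(parallelism) * exchangeEndpoints;
    }

    /** Picks a test parallelism that does not follow the host's core count. */
    static int cappedParallelism(int availableCores, int cap) {
        return Math.max(1, Math.min(availableCores, cap));
    }

    public static void main(String[] args) {
        int poolSize = 2048;       // total reported in the error message
        int exchangeEndpoints = 4; // hypothetical: two shuffles, two sides each
        int cores = Runtime.getRuntime().availableProcessors();

        for (int p : new int[] {cappedParallelism(cores, 4), 32, 64}) {
            long demand = totalBuffers(p, exchangeEndpoints);
            System.out.println("parallelism=" + p
                    + " demand=" + demand
                    + " fits=" + (demand <= poolSize));
        }
    }
}
```

Under these assumptions, a parallelism of 32 requests 4096 buffers against a
pool of 2048, while a parallelism of 4 requests only 64. This is why fixing
the parallelism in the test, or using a test base class that does so, avoids
the failure regardless of how many cores the build machine has.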
