flink-dev mailing list archives

From Sachin Goel <sachingoel0...@gmail.com>
Subject Re: Failing tests on Windows
Date Fri, 17 Jul 2015 16:10:15 GMT
Since failing tests on Windows have come up again: I also found some
failing tests when the community was testing the release candidates for
the 0.9.0 release.
Here is one of the log outputs: http://pastebin.com/raw.php?i=VWbx2ppf
These errors occurred when running mvn clean verify. The following tests
failed:

  BlobUtilsTest.before:45 null
  BlobUtilsTest.before:45 null
  BlobServerDeleteTest.testDeleteFails:291 null
  BlobLibraryCacheManagerTest.testRegisterAndDownload:196 Could not
remove write permissions from cache directory
  BlobServerPutTest.testPutBufferFails:224 null
  BlobServerPutTest.testPutNamedBufferFails:286 null
  JobManagerStartupTest.before:55 null
  JobManagerStartupTest.before:55 null
  DataSinkTaskTest.testFailingDataSinkTask:317 Temp output file has
not been removed
  DataSinkTaskTest.testFailingSortingDataSinkTask:358 Temp output file
has not been removed
  TaskManagerTest.testSubmitAndExecuteTask:123 assertion failed:
timeout (19998080696 nanoseconds) during expectMsgClass waiting for
class org.apache.flink.runtime.messages.RegistrationMessages$RegisterTaskManager
  TaskManagerProcessReapingTest.testReapProcessOnFailure:133
TaskManager process did not launch the TaskManager properly. Failed to
look up akka.tcp://flink@127.0.0.1:50673/user/taskmanager


Most of these again seem related to file system permissions and timeout
errors. Please check whether any changes you make fix these too. It is
unlikely that the final release had these fixed, because no fixes were
explicitly filed for them. If you wish, file JIRAs for these too, if they
still persist.
Further, since the build stops at flink-runtime, I can't be sure whether
tests in later modules would also fail. I can run the verify command again
once there are fixes for these.
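
To separate the permission-related failures from real bugs, a test could
first probe whether the platform honors File.setWritable(false) on a
directory at all. This is only a minimal sketch: WritePermissionProbe and
its methods are hypothetical names, not part of Flink, and it assumes that
a failing createNewFile is a good enough signal that write access was
actually revoked.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical helper, not part of Flink.
class WritePermissionProbe {

    /** Creates a fresh temporary directory to run the probe in. */
    static File newTempDir() {
        try {
            return Files.createTempDirectory("perm-probe").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Returns true only if removing write permission from dir really
     * prevents creating a file inside it. Write access for the owner is
     * restored and the probe file is removed before returning.
     */
    static boolean canRevokeWriteAccess(File dir) {
        File probe = new File(dir, "probe.tmp");
        try {
            if (!dir.setWritable(false, false)) {
                return false; // the platform refused the permission change
            }
            probe.createNewFile();
            return false; // file was created, so the directory is still writable
        } catch (IOException e) {
            return true; // creation was denied, so the permission took effect
        } finally {
            dir.setWritable(true, true);
            probe.delete();
        }
    }
}
```

A test that depends on a read-only directory could call
canRevokeWriteAccess first and skip itself when it returns false, instead
of failing in its setup method.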

Cheers!
Sachin

-- Sachin Goel
Computer Science, IIT Delhi
m. +91-9871457685

On Fri, Jul 17, 2015 at 9:34 PM, Stephan Ewen <sewen@apache.org> wrote:

> Yes, please open JIRAs for that.
>
> If you want to provide some fixes, increasing the timeout in (4) is
> probably reasonable.
>
> On Fri, Jul 17, 2015 at 5:53 PM, Gábor Gévay <ggab90@gmail.com> wrote:
>
> > Hello!
> >
> > I tried to set up a development environment on Windows, but several
> > tests are failing:
> >
> > 1. The setWritable problem. This will be worked around by [1]
> >
> > 2. The tryCleanupOnError before close problem [2]. This could be
> > half-fixed by applying fix 2. from the comment I wrote there, but I
> > think that would still leave the problem open in the FileSinkFunction.
> > Should I open a PR for this?
> >
> > 3. CsvOutputFormatITCase fails about 30% of the time with
> > java.io.IOException: Unable to delete file:
> > C:\Users\Gabor\AppData\Local\Temp\org.apache.flink.streaming.api.outputformat.CsvOutputFormatITCase-result\1
> > at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
> > at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
> > at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
> > at org.apache.flink.test.util.TestBaseUtils.deleteRecursively(TestBaseUtils.java:508)
> > at org.apache.flink.test.util.AbstractTestBase.deleteAllTempFiles(AbstractTestBase.java:141)
> > at org.apache.flink.test.util.AbstractTestBase.stopCluster(AbstractTestBase.java:69)
> > at org.apache.flink.streaming.util.StreamingProgramTestBase.testJobWithoutObjectReuse(StreamingProgramTestBase.java:118)
> > <23 internal calls>
> >
> > I guess this is also some file closing issue.
> >
> >
> > Additionally, there are some more mysterious failures which are
> > happening only from Maven, and I can't reproduce them when running a
> > test from the IDE:
> >
> > 4. testFindConnectableAddress(org.apache.flink.runtime.net.NetUtilsTest)
> >  Time elapsed: 20.936 sec  <<< FAILURE!
> > java.lang.AssertionError: null
> >         at org.junit.Assert.fail(Assert.java:86)
> >         at org.junit.Assert.assertTrue(Assert.java:41)
> >         at org.junit.Assert.assertTrue(Assert.java:52)
> >         at org.apache.flink.runtime.net.NetUtilsTest.testFindConnectableAddress(NetUtilsTest.java:54)
> >
> > It is interesting that it is not happening from the IDE, but I think
> > this is just because it gets less CPU time when some other tests are
> > running in parallel from Maven. It takes 2-4 s from the IDE under
> > Windows, but it takes consistently very close to 2 s under Linux.
> > Maybe the 8 sec timeout could be raised under Windows? (Or what do you
> > think about trying to connect from the multiple interfaces in
> > parallel? That is, parallelizing the outer loop in
> > findAddressUsingStrategy.)
> >
> > 5. testGroupByFeedback(org.apache.flink.streaming.api.IterateTest)
> > Time elapsed: 12.091 sec  <<< ERROR!
> > org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> >         at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:314)
> >         at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> >         at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> >         at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> >         at org.apache.flink.runtime.testingUtils.TestingJobManager$$anonfun$receiveTestingMessages$1.applyOrElse(TestingJobManager.scala:169)
> >         at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
> >         at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36)
> >         at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29)
> >         at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> >         at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29)
> >         at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> >         at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:93)
> >         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> >         at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> >         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> >         at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> >         at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> >         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> >         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> >         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> >         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > Caused by: java.lang.AssertionError: null
> >         at org.junit.Assert.fail(Assert.java:86)
> >         at org.junit.Assert.assertTrue(Assert.java:41)
> >         at org.junit.Assert.assertTrue(Assert.java:52)
> >         at org.apache.flink.streaming.api.IterateTest$6.close(IterateTest.java:447)
> >         at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:40)
> >         at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.close(AbstractUdfStreamOperator.java:75)
> >         at org.apache.flink.streaming.runtime.tasks.StreamTask.closeOperator(StreamTask.java:182)
> >         at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.invoke(OneInputStreamTask.java:112)
> >         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:577)
> >         at java.lang.Thread.run(Thread.java:745)
> >
> > I have no idea what goes wrong here.
> >
> > 6. complexIntegrationTest1(org.apache.flink.streaming.api.complex.ComplexIntegrationTest)
> >  Time elapsed: 15.989 sec  <<< FAILURE!
> > java.lang.AssertionError: Different number of lines in expected and obtained result. expected:<9> but was:<5>
> >         at org.junit.Assert.fail(Assert.java:88)
> >         at org.junit.Assert.failNotEquals(Assert.java:743)
> >         at org.junit.Assert.assertEquals(Assert.java:118)
> >         at org.junit.Assert.assertEquals(Assert.java:555)
> >         at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:272)
> >         at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:258)
> >         at org.apache.flink.streaming.api.complex.ComplexIntegrationTest.after(ComplexIntegrationTest.java:91)
> >
> > This happens only about 30% of the time. There is one thing
> > in the code of this test which is a little suspicious to me: all tests
> > are using the same 'resultPath' and 'expected' variables. Can it not
> > happen that Maven runs these tests in the same JVM, and thus they step
> > on each other's toes?
> >
> >
> > Should I open JIRAs for the last four problems?
> >
> > Best regards,
> > Gabor
> >
> > [1] https://github.com/apache/flink/pull/919
> > [2] https://issues.apache.org/jira/browse/FLINK-2369
> >
>
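
On the "Unable to delete file" error in item 3 of the quoted message and
the "Temp output file has not been removed" failures listed earlier: on
Windows, deleting a file generally fails while a stream on it is still
open, so one thing worth checking is that every output stream is closed
before cleanup runs. The following is a minimal sketch of that ordering
only. CloseBeforeDelete and writeThenDelete are hypothetical names, and
this is not the code path Flink's sinks actually use.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical illustration, not part of Flink.
class CloseBeforeDelete {

    /**
     * Writes one line to a new temp file, closes the stream, then deletes
     * the file. Returns true if the file still exists afterwards.
     */
    static boolean writeThenDelete() {
        try {
            File file = Files.createTempFile("sink-output", ".csv").toFile();
            // try-with-resources closes the stream even if write() throws
            try (OutputStream out = new FileOutputStream(file)) {
                out.write("1,hello\n".getBytes(StandardCharsets.UTF_8));
            }
            // the stream is closed here, so our own handle cannot block the delete
            Files.delete(file.toPath());
            return file.exists();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If cleanup on error runs before close, as described in [2], the delete
would be attempted while the stream is still open, which is the ordering
this sketch avoids.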
