flink-dev mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: YARN ITCases fail, master broken?
Date Sun, 25 Jan 2015 11:55:04 GMT
My VM is configured with 6GB and the OS X host has 16GB. In both setups the
error was identical (with and without cleared .m2).
If the tests pass on a 3GB Travis host, I doubt that my errors are caused
by lack of memory.

2015-01-25 12:28 GMT+01:00 Robert Metzger <rmetzger@apache.org>:

> I also had errors when running the tests in an Ubuntu virtual machine,
> caused by limited memory resources.
>
> Do you think it's okay to assume that we have at least 3 GB of main memory
> available for running the tests? That's what we have with Travis. The YARN
> tests are very memory intensive, because YARN itself has a bunch of
> services (resource manager, 2 node managers) and we allocate some
> containers there, which all need memory.
>
> The main problem is that the amount of memory per node manager is
> hard-coded to 4GB. I've already opened an issue in the Hadoop project and
> will fix the issue there...
>
> But I think I have some ideas how to work around the memory limitations for
> now.
>
>
>
> On Sat, Jan 24, 2015 at 11:47 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
>
> > Hi,
> >
> > "mvn clean verify" fails for me on Ubuntu with deleted .m2 repository.
> > I'm getting the following:
> >
> > Results :
> >
> > Failed tests:
> >   YARNSessionFIFOITCase.setup:56->YarnTestBase.startYARNWithConfig:249 null
> >   YARNSessionCapacitySchedulerITCase.setup:42->YarnTestBase.startYARNWithConfig:249 null
> >
> > Tests run: 2, Failures: 2, Errors: 0, Skipped: 0
> >
> > -V.
> >
> > On 24 January 2015 at 23:03, Fabian Hueske <fhueske@gmail.com> wrote:
> >
> > > The build fails also after the .m2 repository was deleted.
> > >
> > > Does anybody else have this problem?
> > >
> > > 2015-01-24 21:31 GMT+01:00 Stephan Ewen <sewen@apache.org>:
> > >
> > > > Is this reproducible on a machine when you delete the .m2/repository
> > > > directory (local maven cache) ?
> > > >
> > > > (I currently cannot try that because I am behind a rather low-bandwidth
> > > > connection and it would take very long to re-download all dependency
> > > > artifacts)
> > > >
> > > > On Sat, Jan 24, 2015 at 5:54 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> > > >
> > > > > I just tried to build ("mvn clean install") on a fresh Ubuntu VM. It fails
> > > > > with the same exception as natively on MacOS.
> > > > > Something strange is going on...
> > > > >
> > > > > 2015-01-24 11:19 GMT+01:00 Fabian Hueske <fhueske@gmail.com>:
> > > > >
> > > > > > Thanks Robert! Sounds indeed like an environment problem.
> > > > > > Will run the tests again and send you the output.
> > > > > >
> > > > > > 2015-01-24 11:11 GMT+01:00 Robert Metzger <rmetzger@apache.org>:
> > > > > >
> > > > > > >> Okay, the tests have finished on my local machine, and they passed. So it
> > > > > > >> looks like an environment-specific issue.
> > > > > > >> Maybe the log already helps me figure out what the issue is.
> > > > > > >> We should make sure that our tests are passing on all platforms ;)
> > > > > > >>
> > > > > > >> On Sat, Jan 24, 2015 at 11:06 AM, Robert Metzger <rmetzger@apache.org> wrote:
> > > > > >>
> > > > > >> > Hi,
> > > > > >> >
> > > > > > >> > the tests are passing on Travis. Maybe it's an issue with your
> > > > > > >> > environment.
> > > > > > >> > I'm currently running the tests on my machine as well, just to make sure.
> > > > > > >> > I haven't run the tests on OS X, maybe that's causing the issues.
> > > > > > >> >
> > > > > > >> > Can you send me (privately) the full output of the tests?
> > > > > >> >
> > > > > >> > Best,
> > > > > >> > Robert
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > > >> > On Sat, Jan 24, 2015 at 11:00 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> > > > > >> >
> > > > > >> >> Hi Henry,
> > > > > >> >>
> > > > > > >> >> running "mvn -DskipTests clean install" before "mvn clean install" did not
> > > > > > >> >> fix the build for me.
> > > > > > >> >> The failing tests are also integration tests (*ITCase), which are only
> > > > > > >> >> executed in Maven's verify phase, which is not triggered if you run
> > > > > > >> >> "mvn clean test".
> > > > > > >> >> If I run "mvn test" without "mvn install", it fails for me as well with the
> > > > > > >> >> error you posted.
> > > > > > >> >>
> > > > > > >> >> So there seem to be at least two build issues with the current master.
> > > > > >> >>
> > > > > >> >> 2015-01-24 1:47 GMT+01:00 Henry Saputra <
> > henry.saputra@gmail.com
> > > >:
> > > > > >> >>
> > > > > > >> >> > Hmm, I think there could be some weird dependencies to get the Flink
> > > > > > >> >> > YARN uber jar.
> > > > > > >> >> >
> > > > > > >> >> > If you do "mvn clean install -DskipTests" and then call "mvn test", all the
> > > > > > >> >> > tests pass.
> > > > > > >> >> >
> > > > > > >> >> > But if you directly call "mvn clean test", then you see the stack I
> > > > > > >> >> > have seen before.
> > > > > >> >> >
> > > > > >> >> > - Henry
> > > > > >> >> >
> > > > > >> >> >
> > > > > > >> >> > On Fri, Jan 23, 2015 at 3:35 PM, Henry Saputra <henry.saputra@gmail.com> wrote:
> > > > > >> >> > > Did not see that trace but do see this:
> > > > > >> >> > >
> > > > > >> >> > > -------------------------------------------------------
> > > > > >> >> > >
> > > > > >> >> > >  T E S T S
> > > > > >> >> > >
> > > > > >> >> > > -------------------------------------------------------
> > > > > >> >> > >
> > > > > >> >> > > Running org.apache.flink.yarn.UtilsTest
> > > > > >> >> > >
> > > > > >> >> > > log4j:WARN No such property [append]
in
> > > > > >> >> org.apache.log4j.ConsoleAppender.
> > > > > >> >> > >
> > > > > >> >> > > Tests run: 1, Failures: 1, Errors: 0,
Skipped: 0, Time
> > > elapsed:
> > > > > >> 0.476
> > > > > >> >> > > sec <<< FAILURE! - in org.apache.flink.yarn.UtilsTest
> > > > > >> >> > >
> > > > > >> >> > > testUberjarLocator(org.apache.flink.yarn.UtilsTest)
 Time
> > > > > elapsed:
> > > > > >> >> > > 0.405 sec  <<< FAILURE!
> > > > > >> >> > >
> > > > > >> >> > > java.lang.AssertionError: null
> > > > > >> >> > >
> > > > > >> >> > > at org.junit.Assert.fail(Assert.java:86)
> > > > > >> >> > >
> > > > > >> >> > > at org.junit.Assert.assertTrue(Assert.java:41)
> > > > > >> >> > >
> > > > > >> >> > > at org.junit.Assert.assertNotNull(Assert.java:621)
> > > > > >> >> > >
> > > > > >> >> > > at org.junit.Assert.assertNotNull(Assert.java:631)
> > > > > >> >> > >
> > > > > >> >> > > at
> > > > > >> >>
> > > > org.apache.flink.yarn.UtilsTest.testUberjarLocator(UtilsTest.java:32)
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > > Results :
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > > Failed tests:
> > > > > >> >> > >
> > > > > >> >> > >   UtilsTest.testUberjarLocator:32 null
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > > Tests run: 1, Failures: 1, Errors: 0,
Skipped: 0
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > > - Henry
> > > > > >> >> > >
> > > > > > >> >> > > On Fri, Jan 23, 2015 at 2:16 PM, Fabian Hueske <fhueske@gmail.com> wrote:
> > > > > >> >> > >> Hi all,
> > > > > >> >> > >>
> > > > > > >> >> > >> I tried to build the current master (mvn clean install) and some tests in
> > > > > > >> >> > >> the flink-yarn-tests module fail:
> > > > > > >> >> > >>
> > > > > > >> >> > >> Failed tests:
> > > > > > >> >> > >>   YARNSessionCapacitySchedulerITCase.testClientStartup:50->YarnTestBase.runWithArgs:314 During the timeout period of 60 seconds the expected string did not show up
> > > > > > >> >> > >>   YARNSessionCapacitySchedulerITCase>YarnTestBase.checkClusterEmpty:146 There is at least one application on the cluster is not finished
> > > > > > >> >> > >>   YARNSessionFIFOITCase.perJobYarnCluster:184->YarnTestBase.runWithArgs:314 During the timeout period of 60 seconds the expected string did not show up
> > > > > > >> >> > >>   YARNSessionFIFOITCase>YarnTestBase.checkClusterEmpty:146 There is at least one application on the cluster is not finished
> > > > > > >> >> > >>   YARNSessionFIFOITCase>YarnTestBase.checkClusterEmpty:146 There is at least one application on the cluster is not finished
> > > > > > >> >> > >>   YARNSessionFIFOITCase>YarnTestBase.checkClusterEmpty:146 There is at least one application on the cluster is not finished
> > > > > > >> >> > >>   YARNSessionFIFOITCase>YarnTestBase.checkClusterEmpty:146 There is at least one application on the cluster is not finished
> > > > > > >> >> > >>   YARNSessionFIFOITCase>YarnTestBase.checkClusterEmpty:146 There is at least one application on the cluster is not finished
> > > > > > >> >> > >>   YARNSessionFIFOITCase>YarnTestBase.checkClusterEmpty:146 There is at least one application on the cluster is not finished
> > > > > > >> >> > >>   YARNSessionFIFOITCase>YarnTestBase.checkClusterEmpty:146 There is at least one application on the cluster is not finished
> > > > > > >> >> > >>
> > > > > > >> >> > >> Tests run: 10, Failures: 10, Errors: 0, Skipped: 0
> > > > > >> >> > >>
> > > > > >> >> > >> Anybody else got this problem?
> > > > > >> >> > >>
> > > > > >> >> > >> Cheers, Fabian
> > > > > >> >> >
> > > > > >> >>
> > > > > >> >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
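Regarding the question quoted above of whether the tests may assume at least 3 GB of main memory: one way to make such an assumption explicit would be to check the host's physical memory before starting the memory-hungry YARN tests. The following is only a minimal sketch under stated assumptions, not code from Flink or Hadoop. The class name TestMemoryCheck, its method names, and the 3072 MB threshold are hypothetical. Note that a machine with nominally 3 GB often reports slightly less, so a real threshold would probably need some slack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper for illustration; not part of Flink or Hadoop. */
public class TestMemoryCheck {

    /** Illustrative threshold, taken from the "at least 3 GB" figure in the thread. */
    public static final long MIN_MEMORY_MB = 3L * 1024;

    /** Converts a byte count to whole mebibytes (rounds down). */
    public static long bytesToMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Returns true if the given amount of memory meets the required minimum. */
    public static boolean hasEnoughMemory(long totalBytes, long requiredMb) {
        return bytesToMb(totalBytes) >= requiredMb;
    }

    /** Total physical memory in bytes, or -1 if this JVM does not expose it. */
    public static long totalPhysicalMemoryBytes() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // The physical memory size is only available through the com.sun extension bean.
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) os).getTotalPhysicalMemorySize();
        }
        return -1L;
    }

    public static void main(String[] args) {
        long total = totalPhysicalMemoryBytes();
        if (total < 0) {
            System.out.println("Cannot determine physical memory; no decision made.");
        } else if (hasEnoughMemory(total, MIN_MEMORY_MB)) {
            System.out.println("Host reports " + bytesToMb(total) + " MB; running the YARN tests seems reasonable.");
        } else {
            System.out.println("Host reports only " + bytesToMb(total) + " MB; the YARN tests may fail for lack of memory.");
        }
    }
}
```

Such a check only reports total physical memory, not what is actually free, and it says nothing about failures on hosts with plenty of memory like the 6 GB VM and 16 GB OS X machine described at the top of this message.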
