hbase-dev mailing list archives

From Mikhail Antonov <olorinb...@gmail.com>
Subject Re: Planning to roll the 0.98.4 RC on 6/30
Date Thu, 26 Jun 2014 23:59:55 GMT
And if you disable forking completely, do the tests always pass for you, or
do they also fail intermittently?


2014-06-26 15:59 GMT-07:00 Andrew Purtell <apurtell@apache.org>:

> Additionally we run unit tests in parallel to reduce the total time
> required for test suite execution. Surefire will fork multiple JVMs,
> dynamically generate test jars containing a subset of tests, and run them.
> That can make isolating hanging tests difficult but this behavior can be
> influenced by defines on the Maven command line. For example, to fork a
> process for every single unit test:
>
>     mvn test -Dsurefire.firstPartForkMode=always -Dsurefire.secondPartForkMode=always
>
> And then if you find a hanging surefire runner, you can dump the thread
> stacks of that JVM and know that only the unit test whose methods appear in
> the stacks contributed to the current wedged state.
>
>
> On Thu, Jun 26, 2014 at 3:48 PM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
> > Java 7u60 64-bit on an EC2 m3.4xlarge. Just running the unit test suite in
> > a loop. I don't set any special Maven options in MAVEN_OPTS or anything
> > like that.
> >
> > Historically, failures that occur when the whole suite executes, but not
> > when individual tests are run, happen because one test does not shut down
> > in a timely manner, or at all, and a subsequent test might use the same
> > hardcoded path or port. When that happens we have a sporadic and sometimes
> > load-sensitive failure. Complicating matters, each time one clones a
> > repository on a different host or filesystem, JUnit may pick up a different
> > test order, influenced by whatever readdir hands back for each package.
> >
> >
> >
> >
> > On Thu, Jun 26, 2014 at 3:25 PM, Mikhail Antonov <olorinbant@gmail.com>
> > wrote:
> >
> >> Andrew,
> >>
> >> Could you share some details: in what environment are you running the
> >> tests, and at which point do they fail? I'm curious because lately I've
> >> been seeing weird failures on current master too, which do not happen on
> >> hadoop-qa. Individual tests always pass, but when running the suite, tests
> >> either get stuck and time out (at roughly the same point), or fail with an
> >> NPE or a PermGen exception. I blamed my environment at first, but maybe
> >> it's something related.
> >>
> >> -Mikhail
> >>
> >>
> >>
> >>
> >> 2014-06-26 13:39 GMT-07:00 Andrew Purtell <apurtell@apache.org>:
> >>
> >> > I'm finding that repeated runs of the unit test suite at the head of
> >> > branch 0.98 intermittently fail. Individual tests do not, so this is
> >> > likely a lagging shutdown, port/resource conflict, and/or zombie test
> >> > issue. I am currently bisecting commits on the 0.98 branch since the
> >> > last release in the hope of pinning this down to a single change.
> >> > Depending on how quickly that can happen, the RC might happen on Monday
> >> > or not. As things stand at the head of the branch, I'd not +1 the RC
> >> > given the release criteria I've been using up to now.
> >>
> >
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
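
The thread-dump step described in the quoted message could be sketched roughly as follows. This is only an illustration, not part of the HBase tooling: the function names are hypothetical, and it assumes the forked Surefire JVMs show up in `jps -l` output with a jar path containing "surefirebooter", which may differ between Surefire versions. `jps` and `jstack` are the standard JDK tools.

```shell
# Hypothetical helper: read `jps -l` style lines ("<pid> <main class or jar>")
# on stdin and print the PIDs whose entry looks like a Surefire booter jar.
# Assumption: forked test JVMs are launched from a "surefirebooter*.jar".
surefire_pids() {
  awk '/surefirebooter/ { print $1 }'
}

# Hypothetical helper: dump thread stacks of every forked Surefire JVM
# currently running on this host. Requires a JDK on the PATH.
dump_surefire_stacks() {
  jps -l | surefire_pids | while read -r pid; do
    echo "=== thread dump for PID $pid ==="
    jstack "$pid"
  done
}
```

With one forked JVM per test, calling `dump_surefire_stacks` while the run is wedged should produce one dump per test still alive, so the stuck test can be read off the stack frames.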



-- 
Thanks,
Michael Antonov
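
For anyone wanting to reproduce the intermittent failures, "running the unit test suite in a loop" might look something like the sketch below. The function name and the iteration limit are hypothetical; the thread does not say how the loop was actually written.

```shell
# Hypothetical helper: run a command up to N times and stop at the first
# failing iteration, reporting which iteration failed.
# Usage: run_until_failure <max-iterations> <command> [args...]
run_until_failure() {
  max=$1
  shift
  i=1
  while [ "$i" -le "$max" ]; do
    if ! "$@"; then
      echo "failed on iteration $i"
      return 1
    fi
    i=$((i + 1))
  done
  echo "passed $max iterations"
}
```

For example, `run_until_failure 10 mvn test` would stop at the first failing run, leaving the surefire reports of that run in place for inspection.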
