hadoop-hdfs-dev mailing list archives

From Andrew Wang <andrew.w...@cloudera.com>
Subject Re: Policy on adding timeouts to tests
Date Wed, 16 Apr 2014 18:43:54 GMT
The timeouts are nice when trying to debug test failures on Jenkins,
because otherwise you just see something like this:

Caused by: java.lang.RuntimeException: The forked VM terminated without
saying properly goodbye. VM crash or System.exit called ?

We still see this today because some tests lack timeouts. I'd encourage
bringing back the test timeout requirement, but always setting a
conservative value (e.g. always 2+ mins). I think the debuggability
improvements are worth it, and we shouldn't need as many "raise the
timeout" JIRAs.
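
A minimal sketch of how a conservative, configurable timeout could be
resolved in one place, using only the JDK. The class name TestTimeouts and
the property name test.timeout.ms are hypothetical, not taken from Hadoop's
build; the two-minute floor mirrors the "2+ mins" suggestion above.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that resolves a single suite-wide test timeout. */
final class TestTimeouts {
    /** Assumed system property name; not an existing Hadoop or Maven setting. */
    static final String PROPERTY = "test.timeout.ms";

    /** Conservative floor so slow machines do not need "raise the timeout" patches. */
    static final long FLOOR_MILLIS = TimeUnit.MINUTES.toMillis(2);

    private TestTimeouts() {}

    /** Parses a raw property value; falls back to the floor if unset, invalid, or too small. */
    static long resolveMillis(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return FLOOR_MILLIS;
        }
        try {
            return Math.max(FLOOR_MILLIS, Long.parseLong(raw.trim()));
        } catch (NumberFormatException e) {
            return FLOOR_MILLIS;
        }
    }

    /** Reads the assumed system property from the running JVM. */
    static long resolveMillis() {
        return resolveMillis(System.getProperty(PROPERTY));
    }
}
```

A shared base test class could feed the resolved value into a JUnit Timeout
rule like the one quoted further down, so slower environments (or object
store tests) raise the limit with a property instead of a code change. How
the property reaches the forked test JVM depends on the build setup.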

If someone wants to put in some additional effort, it'd be even better to
do what HBase did and categorize our tests into "fast" and "slow" Maven
profiles. This would give us a nice way of running the fast subset as a
smoke test. Right now, I doubt many devs run the test suite locally since it
takes multiple hours.

Best,
Andrew

On Wed, Apr 16, 2014 at 10:51 AM, Tsuyoshi OZAWA
<ozawa.tsuyoshi@gmail.com> wrote:

> Hi Karthik,
>
> Some tests that use servers such as MiniCluster or ZK may never finish,
> for example because of an unexpected busy loop, if they don't have timeouts.
> That can block other jobs on the Jenkins server. Therefore, IMHO, we
> should add timeouts when we write such tests.
>
> Thanks,
> - Tsuyoshi
>
> On Wed, Apr 16, 2014 at 6:11 PM, Steve Loughran <stevel@hortonworks.com>
> wrote:
> > There's a JIRA somewhere that's never gone in, to add a timeout rule to a
> > base class; this rule gets picked up in that test class and all children to
> > specify the timeout
> >
> >   @Rule
> >   public final Timeout testTimeout = new Timeout(TEST_TIMEOUT);
> >
> >
> >    1. If we are going to have a timeout everywhere, it should be
> >    configurable to different delays. For Maven, that's system properties
> >    being passed down and extracted.
> >    2. We don't want that in every @Test method
> >    3. so... we should have an AbstractYarnTest, AbstractMapReduce test, &c,
> >    each picking up the timeout option for their part of the suite
> >    4. then cut out all the other timeouts.
> >    5. and finally document this somewhere.
> >    6. Object store tests need extra-long timeouts; execution time for
> >    multi-GB uploads to S3 and OpenStack object stores is a function of your
> >    upload bandwidth, not machine speed
> >
> > -steve
> >
> >
> >
> > On 15 April 2014 21:20, Karthik Kambatla <kasha@cloudera.com> wrote:
> >
> >> - hwx-hdfs-dev
> >> + hdfs-dev
> >>
> >> Agree with all the points Chris makes.
> >>
> >> I asked this question in the context of a fix that bumps up the timeout to
> >> make the test pass on slower machines. If the timeout is not central to the
> >> test, is the recommended approach to get rid of the timeout?
> >>
> >>
> >>
> >> On Tue, Apr 15, 2014 at 11:37 AM, Chris Nauroth <cnauroth@hortonworks.com> wrote:
> >>
> >> > +common-dev, hdfs-dev
> >> >
> >> > My understanding of the current situation is that we had a period where we
> >> > tried to enforce adding timeouts on all new tests in patches, but it caused
> >> > trouble, and now we're back to not requiring it.  Jenkins test-patch isn't
> >> > checking for it anymore.
> >> >
> >> > I don't think patches are getting rejected for using timeouts though.
> >> >
> >> > The difficulty is that execution time is quite sensitive to the build
> >> > environment.  (Consider top-of-the-line server hardware used in build
> >> > infrastructure vs. a dev running a VirtualBox VM with 1 dedicated CPU, 2 GB
> >> > RAM and slow virtualized disk.)  When we were enforcing timeouts, it was
> >> > quite common to see follow-up patches tuning up the timeout settings to
> >> > make tests work reliably in a greater variety of environments.  At that
> >> > point, the benefit of using the timeout becomes questionable, because now
> >> > the fast machine is running with the longer timeout too.
> >> >
> >> > Chris Nauroth
> >> > Hortonworks
> >> > http://hortonworks.com/
> >> >
> >> >
> >> >
> >> > On Mon, Apr 14, 2014 at 9:41 AM, Karthik Kambatla <kasha@cloudera.com> wrote:
> >> >
> >> > > Hi folks
> >> > >
> >> > > Just wanted to check what our policy for adding timeouts to tests is. Do we
> >> > > encourage/discourage using timeouts for tests? If we discourage using
> >> > > timeouts for tests in general, are we okay with adding timeouts for a few
> >> > > tests where we explicitly want the test to fail if it takes longer than a
> >> > > particular amount of time?
> >> > >
> >> > > Thanks
> >> > > Karthik
> >> > >
> >> >
> >> > --
> >> > CONFIDENTIALITY NOTICE
> >> > NOTICE: This message is intended for the use of the individual or entity to
> >> > which it is addressed and may contain information that is confidential,
> >> > privileged and exempt from disclosure under applicable law. If the reader
> >> > of this message is not the intended recipient, you are hereby notified that
> >> > any printing, copying, dissemination, distribution, disclosure or
> >> > forwarding of this communication is strictly prohibited. If you have
> >> > received this communication in error, please contact the sender immediately
> >> > and delete it from your system. Thank You.
> >> >
> >>
> >
>
>
>
> --
> - Tsuyoshi
>
