hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: maintaining stable HBase build
Date Mon, 26 Sep 2011 19:57:41 GMT
From TRUNK build 2259:

Failed tests:   queueFailover(org.apache.hadoop.hbase.replication.TestReplication): Waited too much time for queueFailover replication

I know Doug's change wouldn't have caused the above failure.

FYI

On Mon, Sep 26, 2011 at 10:45 AM, lars hofhansl <lhofhansl@yahoo.com> wrote:

> I was thinking more along these lines:
> either fix the test so that it does not flap, or remove it.
>
> The first task would be to identify all tests that frequently show
> non-deterministic results.
>
> ------------------------------
> *From:* Ted Yu <yuzhihong@gmail.com>
> *To:* dev@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com>
> *Sent:* Monday, September 26, 2011 2:08 AM
>
> *Subject:* Re: maintaining stable HBase build
>
> Below is a simple script to repeatedly run a unit test.
> I suggest running it, or a similar script, against the new unit test(s) in
> future patches.
>
> #!/bin/bash
> # Run a single unit test repeatedly and stop at the first failure.
> # usage: ./runtest.sh <name of test> <number of repetitions>
> #
> for (( i = 1; i <= $2; i++ ))
> do
>   # nice -n 10 lowers the scheduling priority of the Maven run
>   if ! nice -n 10 mvn test -Dtest="$1"; then
>     echo "$1 failed on run $i"
>     exit 1
>   fi
> done
>
> Thanks
>
> On Sun, Sep 25, 2011 at 2:27 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>
> At Salesforce we call these "flappers" and they are considered almost worse
> than failing tests,
> as they add noise to a test run without adding confidence.
> A test that fails once in, say, 10 runs is worthless.
>
>
>
> ________________________________
> From: Ted Yu <yuzhihong@gmail.com>
> To: dev@hbase.apache.org
> Sent: Sunday, September 25, 2011 1:41 PM
> Subject: Re: maintaining stable HBase build
>
> As of 1:38 PST Sunday, the three builds all passed.
>
> I think we have some tests that exhibit non-deterministic behavior.
>
> I suggest committers space their patch submissions 2 hours apart so that we
> can more easily identify the patch(es) that break the build.
>
> Thanks
>
> On Sun, Sep 25, 2011 at 7:45 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > I wrote a short blog:
> > http://zhihongyu.blogspot.com/2011/09/streamlining-patch-submission.html
> >
> > It is geared towards contributors.
> >
> > Cheers
> >
> >
> > On Sat, Sep 24, 2011 at 9:16 AM, Ramakrishna S Vasudevan 00902313 <
> > ramakrishnas@huawei.com> wrote:
> >
> >> Hi
> >>
> >> Ted, I agree with you.  Pasting the test case results in JIRA is also fine,
> >> especially when some test cases fail locally; if we believe the failures
> >> are not caused by our fix, we can mention that as well.  I think it is
> >> better to run on a Linux box than on a Windows machine.
> >>
> >> +1 for your suggestion Ted.
> >>
> >> Can we add the feature that HDFS has, where Jenkins automatically runs the
> >> test cases when we submit a patch?
> >>
> >> At least until this is done, I will go with your suggestion.
> >>
> >> Regards
> >> Ram
> >>
> >> ----- Original Message -----
> >> From: Ted Yu <yuzhihong@gmail.com>
> >> Date: Saturday, September 24, 2011 4:22 pm
> >> Subject: maintaining stable HBase build
> >> To: dev@hbase.apache.org
> >>
> >> > Hi,
> >> > I want to bring the importance of maintaining a stable HBase build to
> >> > our attention.
> >> > A stable HBase build is important, not just for the next release
> >> > but also
> >> > for authors of the pending patches to verify the correctness of
> >> > their work.
> >> >
> >> > At some time on Thursday (Sept 22nd) 0.90, 0.92 and TRUNK builds
> >> > were all
> >> > blue. Now they're all red.
> >> >
> >> > I don't mind fixing Jenkins build. But if we collectively adopt
> >> > some good
> >> > practice, it would be easier to achieve the goal of having stable
> >> > builds.
> >> > For contributors, I understand that running the whole test suite takes
> >> > so much time that you may not have the luxury of doing it, and Apache
> >> > Jenkins wouldn't do it when you press the Submit Patch button.
> >> > If this is the case (let's call it scenario A), please use Eclipse
> >> > (or other
> >> > tool) to identify tests that exercise the classes/methods in your
> >> > patch and
> >> > run them. Also clearly state what tests you ran in the JIRA.
> >> >
> >> > If you have a Linux box where you can run whole test suite, it
> >> > would be nice
> >> > to utilize such resource and run whole suite. Then please state
> >> > this fact on
> >> > the JIRA as well.
> >> > Considering Todd's suggestion of holding off commit for 24 hours
> >> > after code
> >> > review, 2 hour test run isn't that long.
> >> >
> >> > Sometimes you may see the following (from 0.92 build 18):
> >> >
> >> > Tests run: 1004, Failures: 0, Errors: 0, Skipped: 21
> >> >
> >> > [INFO] ------------------------------------------------------------------------
> >> > [INFO] BUILD FAILURE
> >> > [INFO] ------------------------------------------------------------------------
> >> > [INFO] Total time: 1:51:41.797s
> >> >
> >> > You should examine the test summary above these lines and find out
> >> > which test(s) hung. For this case it was TestMasterFailover:
> >> >
> >> > Running org.apache.hadoop.hbase.master.TestMasterFailover
> >> > Running org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
> >> > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 32.265 sec
> >> >
> >> > I think a script should be developed that parses the test output and
> >> > identifies the hanging test(s).
> >> >
> >> > For scenario A, I hope the committer would run the test suite.
> >> > The net effect would be a statement on the JIRA, saying all tests
> >> > passed.
> >> > Your comments/suggestions are welcome.
> >> >
> >>
> >
> >
>
>
>
>
>
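
The original message in this thread suggests a script that parses test output
and identifies hanging tests. A minimal sketch of that idea follows. It is not
an existing HBase tool, and the function name find_hanging_tests is
hypothetical. It assumes the tests run sequentially, so that every per-class
"Tests run: ... Time elapsed: ..." line belongs to the most recent
"Running <class>" line, and that the log lines carry no extra prefix.

```shell
#!/bin/bash
# Sketch: list test classes that printed "Running <class>" but never printed
# a per-class "Tests run: ... Time elapsed: ..." result line.
# Assumption: tests run one at a time, so a result line always belongs to the
# most recent "Running" line. find_hanging_tests is a hypothetical name.
find_hanging_tests() {
  awk '
    /^Running / {
      # A new class started while the previous one had no result: report it.
      if (pending != "") print pending
      pending = $2
      next
    }
    /^Tests run:.*Time elapsed:/ {
      # Per-class result line: the pending class finished.
      # The final summary line has no "Time elapsed" and is ignored.
      pending = ""
    }
    END {
      # The last class started but never reported a result.
      if (pending != "") print pending
    }
  ' "$@"
}
```

One way to use it is to save the build output to a file (for example with
mvn test 2>&1 | tee build.log) and then call find_hanging_tests build.log.
With no file argument the function reads standard input. For the 0.92 build 18
excerpt quoted above, TestMasterFailover has no result line before the next
"Running" line, so it would be the class reported.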
