hadoop-zookeeper-user mailing list archives

From "Todd Greenwood" <to...@audiencescience.com>
Subject RE: test failures in branch-3.2
Date Fri, 31 Jul 2009 20:07:26 GMT
Patrick,
Thank you for the background (and I hope you and Mahadev recover
quickly).

On a plus note, I'm finding that this morning, @work rather than @home,
the tests continue to completion. However, there are other issues that
I'll bring up on the dev list, such as a requirement to have autoconf
installed, and problems in the create-cppunit-configure task that can't
exec libtoolize, fun stuff like that.

I need to proceed with the manual patches to branch-3.2, as I am under
some time constraints to get our infrastructure deployed such that QA
can start playing with it. However, I'll switch to 3.2.1 as soon as I
can.

-Todd

> -----Original Message-----
> From: Patrick Hunt [mailto:phunt@apache.org]
> Sent: Friday, July 31, 2009 11:38 AM
> To: zookeeper-user@hadoop.apache.org; Todd Greenwood
> Subject: Re: test failures in branch-3.2
> 
> Hi Todd,
> 
> Sorry for the clutter/confusion. Usually things aren't this
> cumbersome ;-)
> 
> In particular:
>    1 committer is on vacation
>    Mahadev's been out sick for multiple days
>    I'm sick but trying to hang in there, but def not 100%
> 
> Hudson (CI) has been offline for effectively the past 3 weeks (that
> gates all our commits) and is just now back but flaky.
> 
> 3.2 had some bugs that we are trying to address, but the
> aforementioned issues are slowing us down. Otherwise we'd have all
> this straightened out by now ....
> 
> At this point you should move this discussion to the dev list - Apache
> doesn't really like us to discuss code changes/futures here (user
> list). On that list you'll also see the plan for upcoming releases - I
> mention all this because we are actively working toward 3.2.1 which
> will include the JIRAs slated for that release (I'm sure you've seen).
> 
> If you can wait a bit you might be able to avoid some pain by using
> the upcoming 3.2.1 release. Once the patches land into that branch
> your issues will be resolved w/o you needing to manually apply
> patches, etc...
> 
> 
> I did look at the files you attached - they look fine so I'm not sure
> what the issue is. The form of this test makes it harder - we are
> verifying that the log contains sufficient information when a
> particular error occurs. We fiddle with log4j in order to do this,
> which means that the log you are including doesn't specify the
> problem.
> 
> Try instrumenting this test with a try/catch around the content of the
> test method (all the code in the failing method inside a big try/catch
> is what I mean). Then print the error to stdout as part of the catch.
> That should shed some light. If you could debug it a bit that would
> help - because we aren't seeing this in our environment.
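
A minimal sketch of the instrumentation described above, written without
a JUnit dependency so it runs on its own. The class name, the TestBody
interface, and runAndReport are hypothetical names used for illustration,
not part of ZooKeeper; in the real test the body of the failing method
would go where the lambda is.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class InstrumentedTestSketch {
    /** Hypothetical stand-in for the body of the failing test method. */
    interface TestBody {
        void run() throws Exception;
    }

    /**
     * Runs the body inside one big try/catch. On failure the full stack
     * trace is printed to stdout and returned; on success null is returned.
     */
    static String runAndReport(TestBody body) {
        try {
            body.run();
            return null;
        } catch (Throwable t) {
            StringWriter buffer = new StringWriter();
            t.printStackTrace(new PrintWriter(buffer, true));
            String trace = buffer.toString();
            // Print to stdout because the test's log4j setup may hide the error.
            System.out.println("Test body failed:");
            System.out.println(trace);
            System.out.flush();
            return trace;
        }
    }

    public static void main(String[] args) {
        // Simulated failure; replace with the real test code.
        runAndReport(() -> {
            throw new IllegalStateException("simulated failure");
        });
    }
}
```

In an actual JUnit test the catch block would normally rethrow after
printing, so the test still fails. If the forked VM exits before the
catch runs, this will not print anything, which is itself a useful hint.
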
> 
> Again, sort of a moot point if you can wait a week or so...
> 
> Regards,
> 
> Patrick
> 
> Todd Greenwood wrote:
> > Inline.
> >
> >> -----Original Message-----
> >> From: Patrick Hunt [mailto:phunt@apache.org]
> >> Sent: Thursday, July 30, 2009 10:57 PM
> >> To: zookeeper-user@hadoop.apache.org
> >> Subject: Re: test failures in branch-3.2
> >>
> >> Todd Greenwood wrote:
> >>> Starting w/ branch-3.2 (no changes) I applied patches in this
> >>> order:
> >>>
> >>> 1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest
> >>> fails.
> >>> 2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file -
> >>> PortAssignment.java.
> >>>
> >>> PortAssignment.java was added by Patrick as part of
> >>> ZOOKEEPER-473.patch, which is a pretty hefty patch (> 2k lines) and
> >>> touches a large number of files.
> >> Hrm, those patches were probably created against the trunk. We'll
> >> have to have separate patches for trunk and 3.2 branch on 481.
> >>
> >> If you could update the jira with this detail (481 needs two
> >> patches, one for each branch) that would be great!
> >>
> >
> > Done.
> >
> >>> 3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails
> >>> (jvm crashes).
> >> 473 is "special" (unique) in the sense that it changes log4j while
> >> the vm is running. In general though it's a pretty boring test and
> >> shouldn't be failing.
> >>
> >> Are you sure you have the right patch file? There are 2 patch files
> >> on the JIRA for 473; make sure that you have the one from 7/16, NOT
> >> the one from 7/15. Check the patch file: the correct one should NOT
> >> contain changes to build.xml or conf/log4j* files. If this still
> >> happens, send me your build.xml, conf/log4j* and
> >> QuorumPeerMainTest.java files in email for review. I'll take a look.
> >>
> >
> >
> > I've annotated the files w/ their date while downloading:
> > 112700 2009-07-31 11:02 ZOOKEEPER-473-7-15.patch
> > 110607 2009-07-31 11:01 ZOOKEEPER-473-7-16.patch
> >
> > It appears I applied the 7-16 patch, as that is the matching file
> > size of the patch file I applied.
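
Besides comparing file sizes, the check Patrick describes (the correct
473 patch should not touch build.xml or conf/log4j*) can be run against
the patch itself. A rough sketch, assuming a unified diff in which each
modified file appears on a line starting with "+++ "; the class and
method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PatchFileCheckSketch {
    /** Collects the path from every "+++ <path>" header line of a unified diff. */
    static List<String> touchedFiles(List<String> patchLines) {
        List<String> files = new ArrayList<>();
        for (String line : patchLines) {
            if (line.startsWith("+++ ")) {
                // The path is the first whitespace-delimited token after "+++ ";
                // anything after it (for example a revision note) is ignored.
                String rest = line.substring(4).trim();
                files.add(rest.split("\\s+")[0]);
            }
        }
        return files;
    }

    /** True if the patch modifies build.xml or any conf/log4j* file. */
    static boolean touchesBuildOrLog4j(List<String> patchLines) {
        for (String path : touchedFiles(patchLines)) {
            if (path.equals("build.xml") || path.startsWith("conf/log4j")) {
                return true;
            }
        }
        return false;
    }
}
```

The lines could be read with java.nio.file.Files.readAllLines. An added
source line whose own text begins with "++ " would be misread as a
header, so this is a quick sanity check rather than a diff parser.
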
> >
> > If there are to be multiple patch files for multiple branches (3.2,
> > trunk, etc.) would it make sense to label the patch files
> > accordingly?
> >
> > Requested files in attached tar.
> >
> > -Todd
> >
> >> Patrick
> >>
> >>
> >>>     [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
> >>>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
> >>>     [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)
> >>>
> >>> ------------
> >>> Test Log
> >>> ------------
> >>> Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
> >>> Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
> >>>
> >>> Testcase: testBadPeerAddressInQuorum took 0.004 sec
> >>>     Caused an ERROR
> >>> Forked Java VM exited abnormally. Please note the time in the report
> >>> does not reflect the time until the VM exit.
> >>> junit.framework.AssertionFailedError: Forked Java VM exited abnormally.
> >>> Please note the time in the report does not reflect the time until the
> >>> VM exit.
> >>>
> >>> -Todd
> >>>
> >>> -----Original Message-----
> >>> From: Patrick Hunt [mailto:phunt@apache.org]
> >>> Sent: Thursday, July 30, 2009 10:13 PM
> >>> To: zookeeper-user@hadoop.apache.org
> >>> Subject: Re: test failures in branch-3.2
> >>>
> >>> Todd Greenwood wrote:
> >>>> ....
> >>>> [Todd] Yes, I believe "address in use" was the problem w/ FLETest.
> >>>> I assumed it was a timing issue w/ respect to test A not fully
> >>>> releasing resources before test B started.
> >>> Might be, but actually I think it's related to this:
> >>> http://hea-www.harvard.edu/~fine/Tech/addrinuse.html
> >>>
> >>> Patrick
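
For reference, a small sketch of the usual remedy for "address in use"
when a port is rebound shortly after being released: enable SO_REUSEADDR
before binding. This assumes the failure comes from sockets lingering in
TIME_WAIT, which is what the linked page is understood to cover; it does
not show what the ZooKeeper tests actually do. The class and method
names are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class ReuseAddressSketch {
    /** Opens a listening socket with SO_REUSEADDR enabled before bind. */
    static ServerSocket openReusable(int port) throws IOException {
        ServerSocket server = new ServerSocket(); // created unbound
        server.setReuseAddress(true);             // must be set before bind()
        server.bind(new InetSocketAddress(port));
        return server;
    }

    /** Binds to an ephemeral port (0) and reports whether the bind worked. */
    static boolean canBindWithReuse() {
        try (ServerSocket server = openReusable(0)) {
            return server.isBound() && server.getLocalPort() > 0;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("bound with SO_REUSEADDR: " + canBindWithReuse());
    }
}
```

Setting the option after the socket is bound has no defined effect, which
is why the sketch uses the no-argument constructor and binds explicitly.
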
