hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: test failures in branch-3.2
Date Fri, 31 Jul 2009 18:37:46 GMT
Hi Todd,

Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-)

In particular:
   1 committer is on vacation
   Mahadev's been out sick for multiple days
   I'm sick but trying to hang in there, but def not 100%

Hudson (CI) has been offline for effectively the past 3 weeks (that 
gates all our commits) and is just now back but flaky.

3.2 had some bugs that we are trying to address, but the aforementioned 
issues are slowing us down. Otherwise we'd have all this straightened out by 
now ....

At this point you should move this discussion to the dev list - Apache 
doesn't really like us to discuss code changes/futures here (user list). 
On that list you'll also see the plan for upcoming releases - I mention 
all this because we are actively working toward 3.2.1 which will include 
the JIRAs slated for that release (I'm sure you've seen).

If you can wait a bit you might be able to avoid some pain by using the 
upcoming 3.2.1 release. Once the patches land into that branch your 
issues will be resolved w/o you needing to manually apply patches, etc...


I did look at the files you attached - they look fine, so I'm not sure what 
the issue is. The form of this test makes it harder to debug - we are 
verifying that the log contains sufficient information when a particular 
error occurs. We fiddle with log4j in order to do this, which means that 
the log you are including doesn't show the problem.

Try instrumenting this test with a try/catch around the content of the 
test method (all the code in the failing method inside a big try/catch 
is what I mean). Then print the error to std out as part of the catch. 
That should shed some light. If you could debug it a bit that would help 
- because we aren't seeing this in our environment.
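
A minimal sketch of that kind of instrumentation, assuming Java. The class, 
interface, and method names below are hypothetical stand-ins and are not 
taken from QuorumPeerMainTest; in the real test the try/catch would simply 
wrap the existing statements of the failing method. The error is printed 
to std out and then rethrown so the test still fails.

```java
public class InstrumentedTestSketch {

    /** Hypothetical stand-in for the statements inside the failing test method. */
    public interface TestBody {
        void run() throws Exception;
    }

    /**
     * Runs the body inside one big try/catch. Anything thrown is printed
     * to std out (log4j is bypassed on purpose) and then rethrown.
     */
    public static void runInstrumented(TestBody body) throws Exception {
        try {
            body.run();
        } catch (Throwable t) {
            System.out.println("Test body threw: " + t);
            t.printStackTrace(System.out);
            System.out.flush();
            if (t instanceof Exception) {
                throw (Exception) t;
            }
            if (t instanceof Error) {
                throw (Error) t;
            }
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        try {
            // Simulated failure; the real test body would go here instead.
            runInstrumented(() -> {
                throw new IllegalStateException("simulated failure");
            });
        } catch (Exception e) {
            System.out.println("rethrown: " + e.getMessage());
        }
    }
}
```

Note that a catch block only sees Java exceptions and errors. If the 
forked VM exits through System.exit or a native crash, nothing will be 
printed by this code.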

Again, sort of a moot point if you can wait a week or so...

Regards,

Patrick

Todd Greenwood wrote:
> Inline.
> 
>> -----Original Message-----
>> From: Patrick Hunt [mailto:phunt@apache.org]
>> Sent: Thursday, July 30, 2009 10:57 PM
>> To: zookeeper-user@hadoop.apache.org
>> Subject: Re: test failures in branch-3.2
>>
>> Todd Greenwood wrote:
>>> Starting w/ branch-3.2 (no changes) I applied patches in this order:
>>>
>>> 1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
>>> 2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file -
>>> PortAssignment.java.
>>>
>>> PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch,
>>> which is a pretty hefty patch (> 2k lines) and touches a large number of
>>> files.
>> Hrm, those patches were probably created against the trunk. We'll have
>> to have separate patches for trunk and 3.2 branch on 481.
>>
>> If you could update the jira with this detail (481 needs two patches,
>> one for each branch) that would be great!
>>
> 
> Done.
> 
>>> 3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm
>>> crashes).
>> 473 is "special" (unique) in the sense that it changes log4j while the
>> vm is running. In general though it's a pretty boring test and
>> shouldn't be failing.
>>
>> Are you sure you have the right patch file? There are 2 patch files on
>> the JIRA for 473; make sure that you have the one from 7/16, NOT the one
>> from 7/15. Check the patch file: the correct one should NOT contain
>> changes to build.xml or conf/log4j* files. If this still happens send me
>> your build.xml, conf/log4j* and QuorumPeerMainTest.java files in email
>> for review. I'll take a look.
>>
> 
> 
> I've annotated the files w/ their date while downloading:
> 112700 2009-07-31 11:02 ZOOKEEPER-473-7-15.patch
> 110607 2009-07-31 11:01 ZOOKEEPER-473-7-16.patch
> 
> It appears I applied the 7-16 patch, as that is the matching file size
> of the patch file I applied.
> 
> If there are to be multiple patch files for multiple branches (3.2,
> trunk, etc.) would it make sense to label the patch files accordingly?
> 
> Requested files in attached tar.
> 
> -Todd
> 
>> Patrick
>>
>>
>>>     [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
>>>     [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
>>>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>>>     [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)
>>>
>>> ------------
>>> Test Log
>>> ------------
>>> Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
>>> Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>>>
>>> Testcase: testBadPeerAddressInQuorum took 0.004 sec
>>>     Caused an ERROR
>>> Forked Java VM exited abnormally. Please note the time in the report
>>> does not reflect the time until the VM exit.
>>> junit.framework.AssertionFailedError: Forked Java VM exited abnormally.
>>> Please note the time in the report does not reflect the time until the
>>> VM exit.
>>>
>>> -Todd
>>>
>>> -----Original Message-----
>>> From: Patrick Hunt [mailto:phunt@apache.org]
>>> Sent: Thursday, July 30, 2009 10:13 PM
>>> To: zookeeper-user@hadoop.apache.org
>>> Subject: Re: test failures in branch-3.2
>>>
>>> Todd Greenwood wrote:
>>>> ....
>>>> [Todd] Yes, I believe "address in use" was the problem w/ FLETest. I
>>>> assumed it was a timing issue w/ respect to test A not fully releasing
>>>> resources before test B started.
>>> Might be, but actually I think it's related to this:
>>> http://hea-www.harvard.edu/~fine/Tech/addrinuse.html
>>>
>>> Patrick
