hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: test failures in branch-3.2
Date Fri, 31 Jul 2009 04:25:11 GMT
btw QuorumPeerMainTest uses the CONSOLE appender, which is set up in 
conf/log4j.properties. Now that I think of it, perhaps not such a good 
idea :-)

If you edited conf/log4j.properties it may be causing the test to fail. 
Did you do this? (If you run the test by itself using -Dtestcase, does it 
always fail?)

I've entered a jira to address this:
https://issues.apache.org/jira/browse/ZOOKEEPER-492

Patrick

Patrick Hunt wrote:
> Todd Greenwood wrote:
>> The build succeeds, but not all of the tests pass. In previous test runs,
>> I noticed an error in org.apache.zookeeper.test.FLETest. It was not able
>> to bind to a port or something. Now, after a machine reboot, I'm getting
>> different failures. 
> 
> "address in use"? That's a problem in the test framework pre-3.3. In 3.3 
> (current svn trunk) I fixed it but it's not in 3.2.x. This is a problem 
> with the test framework though and not a real problem, it shows up 
> occasionally (depends on timing).
> 
>> branch-3.2 $ ant test
>>
>> [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
>> FAILED (crashed)
>> [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED
>>
>> Test logs for these two tests attached.
> 
> This is unusual though - looking at the log it seems that the JVM itself 
> crashed for the QPMainTest! for HQT we are seeing:
> 
> junit.framework.AssertionFailedError: Threads didn't join
> 
> which Flavio once mentioned to me can happen but is not a real 
> problem (he can elaborate).
> 
> What version of java are you using? OS, other environment that might be 
> interesting? (vm? etc...) You might try looking at the jvm crash dump 
> file (I think it's in /tmp)
> 
> If you run each of these two tests individually do they run? example:
> ant -Dtestcase=FLENewEpochTest test-core-java
> 
>> My goal here is to get to a known state (all tests succeeding or have
>> workarounds for the failures). Following that, I plan to apply the
>> patches Flavio recommended for a WAN deploy (479 and 481). After I
>> verify that the tests continue to run, I'll package this up and deploy
>> it to our WAN for testing. 
> 
> Sounds like a good plan.
> 
>> So, are these known issues? Do the tests normally run en masse, or do
>> some of the tests hold on to resources and prevent other tests from
>> passing?
> 
> Typically they do run to completion, but occasionally on my machine 
> (java 1.6, 32-bit linux, 1.6GHz single-core cpu, 1GB memory) I'll get some 
> random failure due to address in use, or the same "didn't join" that you 
> saw. Usually I see this if I'm multitasking (vs just letting the tests 
> run w/o using the box). As I said this is addressed in 3.3 (address 
> reuse at the very least, and I haven't seen the other issues).
> 
> Patrick
> 
> 
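
For readers hitting the "address in use" failures discussed above, here is a
minimal Java sketch of two common ways a test harness can reduce them: asking
the OS for a free port by binding to port 0, and setting SO_REUSEADDR before
binding. This is an illustration only and is not taken from the ZooKeeper
source; the class name PortSketch and both method names are hypothetical, and
the actual change made in 3.3 may differ.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of ZooKeeper.
public class PortSketch {

    // Ask the OS for a currently unused port by binding to port 0,
    // then release it so the test can bind it for real.
    public static int pickFreePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    // Open a listening socket with SO_REUSEADDR enabled. The option has to
    // be set on an unbound socket, before bind(), to have a defined effect.
    public static ServerSocket listenWithReuse(int port) throws IOException {
        ServerSocket server = new ServerSocket(); // created unbound
        server.setReuseAddress(true);
        server.bind(new InetSocketAddress("127.0.0.1", port));
        return server;
    }

    public static void main(String[] args) throws IOException {
        int port = pickFreePort();
        try (ServerSocket server = listenWithReuse(port)) {
            System.out.println("listening on port " + server.getLocalPort());
        }
    }
}
```

Note that pickFreePort() is still racy: another process can take the port
between the probe closing and the real bind, so this narrows the window for
timing-dependent failures rather than eliminating them.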
