zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: test failures in branch-3.2
Date Fri, 31 Jul 2009 04:49:50 GMT
Well, try running these two tests individually and see whether they always
fail or just occasionally. That will be a good start (along with the env detail).
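
A minimal sketch of one way to check this, assuming a POSIX shell and assuming that ant exits non-zero when the selected test fails (worth confirming against build.xml). The helper name run_repeated is hypothetical and not part of the ZooKeeper build; the -Dtestcase property and the test-core-java target are the ones quoted further down in this thread.

```shell
#!/bin/sh
# Run a command several times and report how many of the runs failed.
# Usage: run_repeated <count> <command> [args...]
# NOTE: run_repeated is a hypothetical helper, not part of ZooKeeper.
run_repeated() {
    count=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$count" ]; do
        # Output is discarded here; redirect to a file instead if you
        # want to keep the log of each run.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$count runs failed: $*"
}

# Example invocations, run from the branch-3.2 checkout
# (assumes ant returns a non-zero exit status on test failure):
# run_repeated 5 ant -Dtestcase=QuorumPeerMainTest test-core-java
# run_repeated 5 ant -Dtestcase=HierarchicalQuorumTest test-core-java
```

If every run fails, the problem is probably environmental; if only some runs fail, it looks more like the timing-dependent test framework issue described below.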


Todd Greenwood wrote:
> No edits to conf/log4j.properties.
> -----Original Message-----
> From: Patrick Hunt [mailto:phunt@apache.org] 
> Sent: Thursday, July 30, 2009 9:25 PM
> To: Patrick Hunt
> Cc: zookeeper-user@hadoop.apache.org
> Subject: Re: test failures in branch-3.2
> btw QuorumPeerMainTest uses the CONSOLE appender which is set up in 
> conf/log4j.properties, now that I think of it perhaps not such a good 
> idea :-)
> If you edited conf/log4j.properties it may be causing the test to fail, 
> did you do this? (if you run the test by itself using -Dtestcase does it
> always fail?)
> I've entered a jira to address this:
> https://issues.apache.org/jira/browse/ZOOKEEPER-492
> Patrick
> Patrick Hunt wrote:
>> Todd Greenwood wrote:
>>> The build succeeds, but not all of the tests pass. In previous test runs,
>>> I noticed an error in org.apache.zookeeper.test.FLETest. It was not able
>>> to bind to a port or something. Now, after a machine reboot, I'm getting
>>> different failures.
>> "address in use"? That's a problem in the test framework pre-3.3. In 3.3
>> (current svn trunk) I fixed it but it's not in 3.2.x. This is a problem
>> with the test framework though and not a real problem, it shows up
>> occasionally (depends on timing).
>>> branch-3.2 $ ant test
>>> [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)
>>> [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED
>>> Test logs for these two tests attached.
>> This is unusual though - looking at the log it seems that the JVM itself
>> crashed for the QPMainTest! For HQT we are seeing:
>> junit.framework.AssertionFailedError: Threads didn't join
>> which Flavio once mentioned to me can happen but is not a real
>> problem (he can elaborate).
>> What version of java are you using? OS, other environment that might be
>> interesting? (vm? etc...) You might try looking at the jvm crash dump
>> file (I think it's in /tmp)
>> If you run each of these two tests individually do they run? example:
>> ant -Dtestcase=FLENewEpochTest test-core-java
>>> My goal here is to get to a known state (all tests succeeding or have
>>> workarounds for the failures). Following that, I plan to apply the
>>> patches Flavio recommended for a WAN deploy (479 and 481). After I
>>> verify that the tests continue to run, I'll package this up and deploy
>>> it to our WAN for testing.
>> Sounds like a good plan.
>>> So, are these known issues? Do the tests normally run en masse, or do
>>> some of the tests hold on to resources and prevent other tests from
>>> passing?
>> Typically they do run to completion, but occasionally on my machine
>> (java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some
>> random failure due to address in use, or the same "didn't join" that you
>> saw. Usually I see this if I'm multitasking (vs just letting the tests
>> run w/o using the box). As I said this is addressed in 3.3 (address
>> reuse at the very least, and I haven't seen the other issues).
>> Patrick
