river-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bryan Thompson <br...@systap.com>
Subject Re: test failure repeatability
Date Mon, 01 Apr 2013 12:15:30 GMT
Peter,

As I said, I am happy to take on a few classes and see if I can find some
problems or discuss possible concurrency failures.  I am pretty good at
recognizing concurrency problems, at least within the range of coding
styles that I am familiar with.  But I do not know the river internals.
This will of necessity limit my ability to recognize problems with respect
to the overall architecture.

Bryan

On 4/1/13 8:10 AM, "Peter" <jini@zeus.net.au> wrote:

>The big problem is there are less concurrency bugs in the present branch
>and there are definitely more in the previous release, they just haven't
>presented in test in previous releases, although they have in deployment.
>
>The issue appears to be with javaspaces / lease / transactions.
>
>It's also a possibility that there's just a timing issue with the tests.
>
>I'd like to fix the tests before releasing, it's unfortunate the last fix
>to jeri seemed to expose these new test failures.  It's a little like
>fixing rust in an old car, the more you remove, the more you find.
>
>What I really need is more help, I don't work on this full time it takes
>longer than a team woking on commercial software.
>
>Peter.
>----- Original message -----
>> I am uncomfortable with a release that has known concurrency problems.
>>I
>> am also uncomfortable that the custom Levels serialization change by
>> Oracle [1] has broken the current release for new JVMs.  I would like to
>> see a minor release which fixes that serialization problem and a
>>candidate
>> release which gives people a chance to discovery concurrent issues
>>without
>> risking a release that is known to be unstable.
>>
>> I'm happy to review a few of the classes with known concurrency problems
>> to see if I can help nail some of these bugs.  Since I do not know the
>> river internals, I would only be able to spot concurrency problems that
>> exist within a class.  I am not in a good position to comment on
>> concurrency problems that might arise through the interactions among
>> classes.
>>
>> Bryan
>>
>> [1] https://issues.apache.org/jira/browse/RIVER-416
>>
>> On 4/1/13 6:14 AM, "Peter Firmstone" <jini@zeus.net.au> wrote:
>>
>> > The attachments will be removed from the list, so I've cc'd you,
>>anyone
>> > who's interested, let me know I can forward the attachments.  They can
>> > be opened with jvisualvm.
>> >
>> > The profiling isn't perfect, the test runs for about 8.5 minutes, so
>> > hotspot should have kicked in relatively early in both test runs.
>> >
>> > I guess a significant problem is; the more I remove performance
>> > impedances, like unnecessary DNS calls, the faster multithreading and
>> > context switching gets.
>> >
>> > Not only did the old policy providers create contention, but it was
>> > slower for single threaded performance (I'll have to run the previous
>> > release branch for comparison when I get some time).
>> >
>> > The URIGrant.implies call is now down to .228 ms per invocation, down
>> > from 1.68 ms per invocation this week, which was already quite good
>>(on
>> > old UltraSparcII hardware), during stress tests this method is called
>> > almost 40,000 times.
>> >
>> > In comparison the old policy provider which required a DNS call (every
>> > time CodeSource.implies is called, functionality now replaced by
>> > URIGrant.implies), the old policy provider also cached all
>>Permission's
>> > in highly contended PermissionCollection's, which during network calls
>> > invoked SocketPermission.implies, possibly for every SocketPermission
>>in
>> > the PermissionCollection, DNS is also consulted by
>> > SocketPermission.implies, while synchronized, ouch!
>> >
>> > URIGrant.implies is non blocking, that's right zero contention.
>> > SocketPermission.implies DNS calls can be avoided in most cases if
>> > PermissionComparator finds an exact match or wild card.
>> >
>> > This is without the CombinerSecurityManager, which improves security
>> > performance by a factor of 10 (SocketPermission is only checked once
>>for
>> > each AccessControlContext).
>> >
>> > PreferredClassProvider no longer creates unecessary DNS calls, and
>> > neither does SecureClassLoader,  URLClassLoader or
>>PreferredClassLoader.
>> >
>> > So the good news is the next release will feel much faster, the bad
>>news
>> > is that existing concurrency bugs that previously didn't appear during
>> > test runs, but likely to manifest during production are now occurring
>> > during testing.  There's more good news, we fixed a number of
>> > concurrency bugs since the last release too, I guess I have to draw
>>the
>> > line somewhere and cut a release.
>> >
>> > DNS calls haven't been completely eliminated as some are still
>> > necessary, but a multitude of unnecessary DNS calls have been
>>eliminated.
>> >
>> > Regards,
>> >
>> > Peter.
>> >
>> > Peter Firmstone wrote:
>> > > They've passed more consistently in the past, they're either
>> > > concurrency bugs or network timing related, it would be nice to at
>> > > least determine if it's the former or latter.
>> > >
>> > > Cheers,
>> > >
>> > > Peter.
>> > >
>> > > Tom Hobbs wrote:
>> > > > Are these all new failures, i.e. were they working before?  Or
>>are they
>> > > > "new" failures in that the test categories have only recently been
>> > > > reactivated and the failures discovered?
>> > > >
>> > > > Are they a big enough blocker to stop a release?
>> > > >
>> > > >
>> > > > On Sun, Mar 31, 2013 at 11:31 PM, Peter Firmstone
>><jini@zeus.net.au>
>> > > > wrote:
>> > > >
>> > > >
>> > > > > The following test fails 30 times in a run of 130 tests:
>> > > > >
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > > [java]  #            of          tests    passed  =         
  0
>> > > > > [java]  #            of          tests    passed  =         
  1
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >                100
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > Buildfile: build.xml
>> > > > >
>> > > > > qa.run-tests:
>> > > > >
>> > > > > james-brown:
>> > > > >    [delete] Deleting directory /opt/src/River_Fixed/**
>> > > > > peterConcurrentPolicy/qa/soul
>> > > > >      [mkdir] Created dir: /opt/src/River_Fixed/**
>> > > > > peterConcurrentPolicy/qa/soul
>> > > > >      [touch] Creating
>> > > > > /opt/src/River_Fixed/**peterConcurrentPolicy/qa/soul/*
>> > > > > *soul.201303312239034808
>> > > > >
>> > > > > run-tests:
>> > > > >        [java]
>> > > > >        [java] ------------------------------**-----------
>> > > > >        [java] CONFIGURATION FILE:
>> > > > >        [java]
>> > > > >        [java]
>>/opt/src/River_Fixed/**peterConcurrentPolicy/qa/src/**
>> > > > > com/sun/jini/test/resources/**qaHarness.prop
>> > > > >        [java]
>> > > > >        [java] ------------------------------**-----------
>> > > > >        [java] SETTING UP THE TEST LIST:
>> > > > >        [java]
>> > > > >        [java]      Adding test:
>> > > > > com/sun/jini/test/spec/**javaspace/conformance/
>> > > > > **snapshot/**SnapshotExpirationNotifyTest.**td
>> > > > >        [java]
>> > > > >        [java] ------------------------------**-----------
>> > > > >        [java] GENERAL HARNESS CONFIGURATION INFORMATION:
>> > > > >        [java]
>> > > > >        [java]      Date started:
>> > > > >        [java]            Sun Mar 31 22:39:37 EST 2013
>> > > > >        [java]      Installation directory of the JSK:
>> > > > >        [java]
>>com.sun.jini.jsk.home=/opt/**src/River_Fixed/**
>> > > > > peterConcurrentPolicy
>> > > > >        [java]      Installation directory of the harness:
>> > > > >        [java]
>>com.sun.jini.qa.home=/opt/src/**River_Fixed/**
>> > > > > peterConcurrentPolicy/qa
>> > > > >        [java]      Categories being tested:
>> > > > >        [java]            categories=No Categories
>> > > > >        [java] ------------------------------**-----------
>> > > > >        [java] ENVIRONMENT PROPERTIES:
>> > > > >        [java]
>> > > > >        [java]      JVM information:
>> > > > >        [java]            Java HotSpot(TM) Server VM, 20.5-b03,
>>32 bit VM mode
>> > > > >        [java]            Sun Microsystems Inc.
>> > > > >        [java]      OS information:
>> > > > >        [java]            SunOS, 5.10, sparc
>> > > > >        [java]
>> > > > >        [java] ------------------------------**-----------
>> > > > >        [java] STARTING TO RUN THE TESTS
>> > > > >        [java]
>> > > > >        [java]
>> > > > >        [java] Running
>>com/sun/jini/test/spec/**javaspace/conformance/**
>> > > > > snapshot/**SnapshotExpirationNotifyTest.**td
>> > > > >        [java] Time is Sun Mar 31 22:39:38 EST 2013
>> > > > >        [java] Starting test in separate process with command:
>> > > > >        [java] /usr/jdk/jdk1.6.0_30/jre/bin/**java
>> > > > >
>> > > > >
>>-Djava.security.manager=org.**apache.river.api.security.**CombinerSecur
>> > > > > ityManager
>> > > > >
>> > > > > -Djava.security.policy=file:/**opt/src/River_Fixed/**
>> > > > > peterConcurrentPolicy/qa/**harness/policy/defaulttest.**policy
>> > > > >
>> > > > >
>>-Djava.rmi.server.codebase=htt**p://bluto:9082/qa1-javaspace-**dl.jar<h
>> > > > > ttp://bluto:9082/qa1-javaspace-dl.jar>-cp
>> > > > > /opt/src/River_Fixed/
>> > > > >
>> > > > >
>>**peterConcurrentPolicy/qa/lib/**jiniharness.jar:/opt/src/**River_Fixed
>> > > > > /**
>> > > > >
>> > > > > peterConcurrentPolicy/qa/lib/**jinitests.jar:/opt/src/River_**
>> > > > >
>> > > > >
>>Fixed/peterConcurrentPolicy/**lib/jsk-platform.jar:/opt/src/**River_Fix
>> > > > > ed/
>> > > > >
>> > > > >
>>**peterConcurrentPolicy/lib/jsk-**lib.jar:/opt/src/River_Fixed/**
>> > > > >
>> > > > >
>>peterConcurrentPolicy/lib/**high-scale-lib.jar:/opt/src/**River_Fixed/*
>> > > > > *
>> > > > >
>> > > > > peterConcurrentPolicy/lib/**custard-apple-1.0.2.jar -ea -esa
>>-client
>> > > > > -Djava.ext.dirs=/usr/jdk/jdk1.**6.0_30/jre/lib/ext:/usr/jdk/**
>> > > > >
>> > > > >
>>packages/lib/ext:/opt/src/**River_Fixed/**peterConcurrentPolicy/qa/lib-
>> > > > > **
>> > > > >
>> > > > > ext:/opt/src/River_Fixed/**peterConcurrentPolicy/lib-ext
>> > > > > -Dcom.sun.jini.jsk.port=9080 -Dcom.sun.jini.qa.port=9081
>> > > > >
>>-Dcom.sun.jini.jsk.home=/opt/**src/River_Fixed/**peterConcurrentPolicy
>> > > > >
>> > > > >
>>-Dcom.sun.jini.qa.home=/opt/**src/River_Fixed/**peterConcurrentPolicy/q
>> > > > > a
>> > > > >
>> > > > > -Dcom.sun.jini.qa.harness.**harnessJar=/opt/src/River_**
>> > > > > Fixed/peterConcurrentPolicy/**qa/lib/jiniharness.jar
>> > > > > -Dcom.sun.jini.qa.harness.**testJar=/opt/src/River_Fixed/**
>> > > > > peterConcurrentPolicy/qa/lib/**jinitests.jar
>> > > > > -Dcom.sun.jini.qa.harness.**runjiniserver=true
>> > > > > -Dcom.sun.jini.qa.harness.**runkitserver=true
>> > > > > -Djava.security.properties=*
>> > > > > *file:/opt/src/River_Fixed/**peterConcurrentPolicy/qa/**
>> > > > > harness/trust/dynamic-policy.**properties
>> > > > > -Dcom.sun.jini.qa.harness.**testhosts=
>> > > > >
>>-Djava.util.logging.config.**file=/home/peter/logging.**properties
>> > > > >
>> > > > >
>>-Dcom.sun.jini.test.home=/opt/**src/River_Fixed/**peterConcurrentPolicy
>> > > > > /qa
>> > > > >
>> > > > > -Dcom.sun.jini.test.port=9082 -Dcom.sun.jini.qa.harness.**
>> > > > > policies=file:/opt/src/River_**Fixed/peterConcurrentPolicy/**
>> > > > > qa/src/com/sun/jini/test/**resources/jinitest.policy
>> > > > > -Djava.ext.dirs=/usr/jdk/jdk1.**6.0_30/jre/lib/ext:/usr/jdk/**
>> > > > >
>> > > > >
>>packages/lib/ext:/opt/src/**River_Fixed/**peterConcurrentPolicy/qa/lib-
>> > > > > **
>> > > > >
>> > > > > ext:/opt/src/River_Fixed/**peterConcurrentPolicy/lib-ext
>> > > > > com.sun.jini.qa.harness.**MasterTest com/sun/jini/test/spec/**
>> > > > >
>>javaspace/conformance/**snapshot/**SnapshotExpirationNotifyTest.**td
>> > > > >        [java] com.sun.jini.qa.harness.**TestException: Not all
>> > > > > listeners've
>> > > > > got expected number of events.
>> > > > >        [java]        at
>>com.sun.jini.test.spec.**javaspace.conformance.**
>> > > > > snapshot.**SnapshotExpirationNotifyTest.**run(**
>> > > > > SnapshotExpirationNotifyTest.**java:370)
>> > > > >        [java]        at
>> > > > > com.sun.jini.qa.harness.**MasterTest.doTest(MasterTest.*
>> > > > > *java:256)
>> > > > >        [java]        at
>> > > > > com.sun.jini.qa.harness.**MasterTest.main(MasterTest.**
>> > > > > java:144)
>> > > > >        [java]
>> > > > >        [java] TIME: 10:42:54 PM
>> > > > >        [java]
>> > > > >        [java] Test process was destroyed and returned code 1
>> > > > >        [java]
>> > > > > com/sun/jini/test/spec/**javaspace/conformance/**snapshot/**
>> > > > > SnapshotExpirationNotifyTest.**td
>> > > > >        [java] Test Failed: Test Failed:
>> > > > > com.sun.jini.qa.harness.**TestException:
>> > > > > Not all listeners've got expected number of events.
>> > > > >        [java]
>> > > > >        [java]
>> > > > >        [java] ------------------------------**-----------
>> > > > >        [java]
>> > > > >        [java] SUMMARY ==============================**===
>> > > > >        [java]
>> > > > >        [java]
>> > > > > com/sun/jini/test/spec/**javaspace/conformance/**snapshot/**
>> > > > > SnapshotExpirationNotifyTest.**td
>> > > > >        [java] Test Failed: Test Failed:
>> > > > > com.sun.jini.qa.harness.**TestException:
>> > > > > Not all listeners've got expected number of events.
>> > > > >        [java]
>> > > > >        [java] ------------------------------**-----------
>> > > > >        [java]
>> > > > >        [java] # of tests started    = 1
>> > > > >        [java] # of tests completed = 1
>> > > > >        [java] # of tests passed      = 0
>> > > > >        [java] # of tests failed      = 1
>> > > > >        [java]
>> > > > >        [java] ------------------------------**-----------
>> > > > >        [java]
>> > > > >        [java]      Date finished:
>> > > > >        [java]            Sun Mar 31 22:42:59 EST 2013
>> > > > >        [java]      Time elapsed:
>> > > > >        [java]            201 seconds
>> > > > >        [java]
>> > > > >        [java] Java Result: 1
>> > > > >
>> > > > > collect-result:
>> > > > >
>> > > > > BUILD FAILED
>> > > > > /opt/src/River_Fixed/**peterConcurrentPolicy/build.**xml:2105:
>>The
>> > > > > following error occurred while executing this line:
>> > > > > /opt/src/River_Fixed/**peterConcurrentPolicy/qa/**build.xml:357:
>> > > > > condition satisfied
>> > > > >
>> > > > > Total time: 3 minutes 30 seconds
>> > > > >
>> > > > >
>> > > > >
>> > > >
>> > > >
>> > >
>> > >
>> >
>>
>


Mime
View raw message