river-dev mailing list archives

From Peter Firmstone <j...@zeus.net.au>
Subject Re: Help needed with concurrency bug
Date Sun, 25 Dec 2011 21:54:55 GMT
Dan Creswell wrote:
>>> Where "bug" is potentially a swallowed exception.
>>> Reggie for the most part holds a writeLock for any significant
>>> invocation from the client and events are dispatched via a
>>> TaskManager. Seems like an unlikely source for problems. Am I right in
>>> assuming the test itself is single-threaded? Sure looks like it is?
>>> That would suggest to me a problem in the remote comms layer if
>>> anywhere. My biggest worry would be that your permissions work is
>>> leading to a security exception or similar and it's just being
>>> swallowed.
>>> Take that worry and combine it with the fact that the last failing
>>> test does throw an exception and doesn't touch remote services or
>>> indeed the JERI layer, I'd say this is the test to look at first and
>>> perhaps we're not looking at an additional bug.
>> Nope, it's a concurrency bug, TaskManager would create task threads for the
>> events, the bug goes away when I activate security debug, there are no
>> permission failures.
> Sure, it's a concurrency bug I just happen to think all the symptoms
> point at the new code being the culprit. Activating all that security
> debug reduces interleaving and such reducing the chance of e.g. a race
> condition occurring.

Well, it's not livelock, and in normal runs it doesn't look like a plain
deadlock either. In one test, though, I can induce a deadlock (no
progress, no CPU load) with the debugger, so it isn't a spin loop or a
CAS retry; it's some kind of synchronization deadlock.
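
One way to check the "no progress, no CPU load" state from inside the JVM is to ask the standard ThreadMXBean for lock cycles. This is only a minimal sketch: the class name DeadlockProbe and the method report() are made up for illustration and are not part of River or the test harness.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of River: reports threads the JVM
// considers deadlocked on monitors or java.util.concurrent locks.
public class DeadlockProbe {

    /** Returns one line per deadlocked thread, or "" if none are found. */
    public static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no cycle is detected
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // thread exited between the two calls
            }
            sb.append(info.getThreadName())
              .append(" waiting on ").append(info.getLockName())
              .append(" held by ").append(info.getLockOwnerName())
              .append('\n');
        }
        return sb.toString();
    }
}
```

Calling DeadlockProbe.report() from a watchdog thread while the test hangs would name the threads and locks involved. An empty result does not rule out a hang: a thread stuck in wait() after a missed notify is not a lock cycle and will not be reported.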

Considering I have trouble printing the ProtectionDomain in the 
debugger, could this be a stale reference, or is there no relation?

> And the fact that there are no lost events unless your code is present
> can cut both ways but the simplest explanation would be a bug in your
> code not a bug in TaskManager which has virtually nothing to do with
> security.

If there's a concurrency issue in my code, it's in DynamicPolicyProvider 
or one of the classes it uses, not ConcurrentPolicyFile, since this bug 
still occurs when I replace it with Sun's implementation.

