river-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Peter Firmstone <j...@zeus.net.au>
Subject Re: The importance of safe publication
Date Wed, 05 Jun 2013 08:18:43 GMT


On 5/06/2013 3:21 AM, Gregg Wonderly wrote:
> Did you use  a concurrent Set implementation?

On this occassion I used ConcurrentSkipListSet, however a synchronized 
HashSet might be more appropriate.  The HashSet would be faster, but 
synchronized, while the ConcurrentSkipListSet would take longer but is 
non blocking.  When it triggered the bug, I left it with the 
ConcurrentSkipListSet.   Now the bug's solved, I'll probably leave it as 
it is, but it didn't appear to be creating any issues with Hashtable and 
Vector, I almost didn't replace it, but now I'm glad I did.

The concurrent code is actually simpler to understand and easier to read 
on this occassion.

> The "synchronized" nature of "Hashtable" and "Vector" will, 
> unfortunately, "fix" a number of concurrency issues by causing 
> constant cache line updates to occur.  This of course is a huge impact 
> to non-concurrent code, and hence "HashMap", "HashSet" and "ArrayList" 
> make things go a lot faster there.
>
> Perhaps your change in classes inadvertently removed synchronization 
> and thus Happens Before which was keeping the JIT out of the mix?

Quite possibly, although I'd seen the failure at least once previously 
on Arm, but assumed it was fixed because it hadn't reappeared.  I guess 
it was still there, just lurking.

>
> Gregg Wonderly
>
> On 6/3/2013 5:13 PM, Peter Firmstone wrote:
>> Found a beaut bug, this time it relates to 
>> com.sun.jini.outrigger.EntryRep, this is what I think's occurring on 
>> the client side.
>>
>> During construction arrays were created, written to volatile 
>> variables, then populated with values.
>>
>> Now EntryRep uses default serialization, it isn't synchronized if 
>> marshalled by a different thread, and an EntryRep is created for 
>> every Entry written into the space.
>>
>> Previously I'd only seen similar test failures on Arm, but now I 
>> could observe it on Windows, the platform so far least affected by 
>> concurrency issues.
>>
>> com/sun/jini/test/impl/outrigger/leasing/UseNotifyLeaseTest.td
>>
>> How did I find it?
>>
>> An unrelated class com.sun.jini.outrigger.TypeTree used the data 
>> structure, Hashtree<String,Vector<String>> internally, to cache all 
>> subclasses, I replaced the data structure with
>> ConcurrentMap<String,Set<String>>, which simplified the code 
>> somewhat, this also allowed the unrelated EntryRep to fail on 
>> Window's where previously it wasn't evident.
>>
>> The good news is, it even failed while being observed with visualvm.  
>> It appears that the test was running well until hotspot optimised 
>> reflective method invocation, after that, the EntryRep array contents 
>> went missing and the test subsequently failed because the Watchers no 
>> longer matched the EntryRep and didn't send any more event 
>> notifications.
>>
>> I've just committed the fix, feel free to reverse the changes to 
>> EntryRep and play around with unsafe publication.
>>
>> Anyone seen any strange behaviour writing Entry's to the space in 
>> deployment?  Eg entry's going missing, not matching, or update 
>> notifications not occurring?  It's likely this could have been 
>> confused with network failure, which the Jini infrastructure handles 
>> quite well.
>>
>> Regards,
>>
>> Peter.
>>
>>
>>
>>
>>
>


Mime
View raw message