river-dev mailing list archives

From Peter Firmstone <j...@zeus.net.au>
Subject The importance of safe publication
Date Mon, 03 Jun 2013 22:13:07 GMT
Found a beaut bug, this time in com.sun.jini.outrigger.EntryRep. This is 
what I think is occurring on the client side.

During construction arrays were created, written to volatile variables, 
then populated with values.

EntryRep uses default serialization, which isn't synchronized, so a 
different thread that marshals the object isn't guaranteed to see the 
populated arrays. An EntryRep is created for every Entry written into 
the space.
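
To illustrate the pattern described above, here is a minimal sketch. 
UnsafeRep and SafeRep are hypothetical stand-ins, not the real EntryRep 
and not the fix I committed; the field and method names are assumptions 
made up for the example. The safe variant fills a local array first and 
only then assigns it to a final field.

```java
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical stand-in: mirrors the problematic pattern, not the real EntryRep.
final class UnsafeRep implements Serializable {
    private static final long serialVersionUID = 1L;
    private volatile String[] superclasses;

    UnsafeRep(String[] names) {
        // The array reference is written to the volatile field first...
        superclasses = new String[names.length];
        // ...and populated afterwards. These element writes happen after the
        // volatile write, so a thread that obtains this object without
        // synchronization is not guaranteed to see them.
        for (int i = 0; i < names.length; i++) {
            superclasses[i] = names[i];
        }
    }

    String[] superclasses() {
        return superclasses.clone();
    }
}

// Hypothetical stand-in: one way to publish the same state safely.
final class SafeRep implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String[] superclasses;

    SafeRep(String[] names) {
        // Fully populate a private copy, then assign it to a final field.
        // Final field semantics cover the array contents as they were when
        // the constructor finished, provided 'this' does not escape.
        String[] copy = Arrays.copyOf(names, names.length);
        superclasses = copy;
    }

    String[] superclasses() {
        return superclasses.clone();
    }
}
```

Single-threaded, both classes behave identically, which is why this kind 
of bug can hide for so long; the difference only shows when another 
thread reads the object without a happens-before edge.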

Previously I'd only seen similar test failures on ARM, but now I can 
observe it on Windows, the platform least affected so far by concurrency 
issues.

How did I find it?

An unrelated class, com.sun.jini.outrigger.TypeTree, used a 
Hashtable<String,Vector<String>> internally to cache all subclasses. I 
replaced it with ConcurrentMap<String,Set<String>>, which simplified the 
code somewhat. The change also exposed the EntryRep failure on Windows, 
where previously it wasn't evident.
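
For anyone wondering what a ConcurrentMap<String,Set<String>> subclass 
cache might look like, here is a rough sketch. SubclassCache and its 
method names are hypothetical, chosen for the example; this is not the 
actual TypeTree code. Hashtable methods are synchronized, so one 
plausible reading is that the old locking incidentally masked the 
EntryRep problem, but I haven't verified that.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of a lock-free subclass cache; not TypeTree's real API.
final class SubclassCache {
    private final ConcurrentMap<String, Set<String>> subclasses =
            new ConcurrentHashMap<String, Set<String>>();

    /** Records that 'subclass' extends 'superclass'. Safe to call concurrently. */
    void add(String superclass, String subclass) {
        Set<String> known = subclasses.get(superclass);
        if (known == null) {
            // Create a concurrent set and try to install it atomically.
            Set<String> fresh =
                    Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
            known = subclasses.putIfAbsent(superclass, fresh);
            if (known == null) {
                known = fresh; // we won the race, so our set is the shared one
            }
        }
        known.add(subclass);
    }

    /** Returns a read-only view of the known subclasses, empty if none are recorded. */
    Set<String> subclassesOf(String superclass) {
        Set<String> known = subclasses.get(superclass);
        if (known == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(known);
    }
}
```

putIfAbsent keeps two racing threads from each installing their own set 
for the same key, so no caller-side locking is needed.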

The good news is that it failed even while being observed with VisualVM. 
It appears the test was running well until HotSpot optimised reflective 
method invocation; after that, the EntryRep array contents went missing 
and the test subsequently failed because the Watchers no longer matched 
the EntryRep and didn't send any more event notifications.

I've just committed the fix, feel free to reverse the changes to 
EntryRep and play around with unsafe publication.

Has anyone seen any strange behaviour writing Entries to the space in 
deployment?  E.g. entries going missing, not matching, or update 
notifications not occurring?  It's likely this could have been confused 
with network failure, which the Jini infrastructure handles quite well.


