river-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From MICHAEL MCGRADY <mmcgr...@topiatechnology.com>
Subject Re: datastructure classes
Date Tue, 14 Dec 2010 07:32:39 GMT
Patricia

http://www.cse.lehigh.edu/~spear/
http://www.cs.rochester.edu/u/vmarathe/
http://www.cs.rice.edu/~wns1/
http://www.cs.rochester.edu/u/scott/papers/2003_FTDCS_IW.pdf

A few other sources that are not so antiquated, which is not a negative.  I have asked Prof
Scott to share his latest research.

MG

On Dec 13, 2010, at 5:15 PM, Patricia Shanahan wrote:

> You make a good point that I should take a look around to see if I can find anything
suitable that already exists, with appropriate licensing.
> 
> That said, I believe the requirements for an outrigger FastList are:
> 
> 1. Highly scalable performance. In particular, iterating over the list should be fast
even when many threads are doing it. As far as I can tell, any JavaSpace lookup turns into
iterating over a FastList representing items of acceptable classes, to see if any of the nodes
match the other template requirements.
> 
> 2. A limited set of features: Iterate over the list starting at the head, remove a node,
add at tail.
> 
> The two libraries each go in a different direction.
> 
> Javolution is primarily a single thread speed play, with fast and predictable performance.
It is what I would want if I were writing real time code. It does have some support for concurrency,
but not what is needed for outrigger: "Fast collections (or maps) can be made thread-safe
by marking them FastCollection#shared shared Having a shared collection (or map) means that
it can be safely used without synchronization. It does not mean that a change made by a thread
is automatically viewed by another thread (that would require synchronizing)."
> 
> The Apache Commons collections are very feature rich, but seem to be primarily single
thread, with the synchronizing decorator approach to concurrency.
> 
> I will do some more web searching, but my current strongest contender for an existing
class is java.util.concurrent.ConcurrentLinkedQueue. It's main limitation is slow remove,
but that might be worked around by using a pair of queues, and periodically cleaning by copying
from one queue to another. Also, java.util.concurrent contains a reference to an important
paper, http://www.cs.rochester.edu/u/michael/PODC96.html, that gives me a starting point for
finding the latest research on these issues.
> 
> Patricia
> 
> 
> 
> 
> On 12/13/2010 4:30 PM, James Grahn wrote:
>> Because there've been a few concerns about memory models and concurrent
>> data structures, I thought I'd suggest a couple of resources. Library
>> usage is preferable to reinventing the wheel, where possible:
>> 
>> First:
>> http://javolution.org/
>> 
>> I haven't used it directly; I've only used a library which used it. But
>> it perhaps deserves a scan to see if it would meet our requirements (and
>> testing to ensure it's acceptable if it claims to).
>> 
>> Also, it's BSD-licensed. I believe that's acceptable?
>> 
>> Second:
>> http://commons.apache.org/collections/index.html
>> 
>> I don't *think* it has what we need, but it's worth poking around in
>> before creating a new datastructure. Has the advantage of being part of
>> the Apache family.
>> 
>> jamesG
>> 
>> On 12/13/2010 5:24 PM, Peter Firmstone wrote:
>>> Gregg Wonderly wrote:
>>>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>>>>> This does fail fairly quickly (immediately) on my windows laptop.
>>>>> 
>>>>> I am not sure that I have time to really look over this code. I
>>>>> wonder if anyone
>>>>> knows if this is relatively new code that John put together as part
>>>>> of the
>>>>> effort to remove the use of PSE from outrigger, or is the original
>>>>> "non-persistent" javaspaces implementation?
>>>>> 
>>>>> Perhaps we need to do something different here, a segmented list for
>>>>> example,
>>>>> which is what PSE did with it's Vector implementation so that
>>>>> segments of the
>>>>> list could be locked independently, as well as allowing the segments
>>>>> to be
>>>>> "deleted" from disk once they were "empty".
>>>> 
>>>> And, of course if we pull out outrigger, as an application/service,
>>>> separate from the Jini part of river, we could just say that Outrigger
>>>> requires JDK1.5 so that we can move to a different concurrency
>>>> implementation if that is needed, using the new memory model.
>>> 
>>> It seems like a lot of work to fix a package private implementation,
>>> already based on flawed assumptions (Patricia's done a great job
>>> debugging this, she's a real asset, I look forward to learning more
>>> debugging tips). I suspect we'll get a much better result starting from
>>> scratch, utilising Java 6.
>>> 
>>> Since River is a Jini platform, why don't we start by creating an
>>> independent implementation of outrigger utilising any latest available
>>> java features. Not only will this produce a better implementation, that
>>> will be easier to support, but it might improve our understanding of
>>> what's required for a modular build as well.
>>> 
>>> The existing outrigger implementation can remain as it is, but
>>> deprecated, left in place for legacy support.
>>> 
>>> I've got some concurrent utilities in pepe you may pinch / improve if
>>> you like:
>>> 
>>> org.apache.river.impl.util.*
>>> 
>>> ConcurrentCollections
>>> ConcurrentSoftMap
>>> ConcurrentWeakIdentityMap
>>> ConcurrentWeakMap
>>> 
>>> ConcurrentCollections is a multi read single write lock based collection
>>> wrapper. It defensively copies for Iterators but still allows performing
>>> removals from the underlying collection.
>>> 
>>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>>> ReferenceQueue to remove stale entries prior to every method call.
>>> Cheers,
>>> 
>>> Peter.
>>> 
>>>> 
>>>> One of the things I did in griddle, was define a RemoteIterator which
>>>> allows a "get" operation to have a return value before anything is in
>>>> the space. A server side thread, then looks through the space for
>>>> matches and adds them to the return queue for the calling "get". This
>>>> allows server side contention to be managed to some degree because the
>>>> number of "searching" threads could be held to an appropriate minimum
>>>> (even one). The javaspaces API doesn't disallow such proxy
>>>> implementation and JavaSpaces05 iterator support starts to expose this
>>>> kind of thing more literally.
>>>> 
>>>> Ultimately, I think a segmented list that looks like a
>>>> List<List<Entry>> would be a way to distribute locking for "get"
>>>> because iteration would be less likely to be on the same segment at
>>>> the same time.
>>>> 
>>>> Insertion at the tail, is always a contentious issue for concurrent
>>>> lists. Sometimes you can just use a small ConcurrentHashMap to perform
>>>> adds until it gets to a particular size, and then turn its contents
>>>> into a List and add that list to the tail of the List<List<Entry>>.
>>>> You can choose then to decide when to do that movement by watching for
>>>> the other lists to be empty too, or when a traversal gets to the end
>>>> of what is in a list already.
>>>> 
>>>> This keeps writers quite free to insert quickly and completely away
>>>> from the readers as well. Ordering presents most of the opportunities
>>>> for big time contention. Unordered (map), or more segmented (map of
>>>> list) construction relieves some of the contention if course, by
>>>> distributing it.
>>>> 
>>>> Gregg
>>>> 
>>>>> Gregg
>>>>> 
>>>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>>>>> Patricia Shanahan wrote:
>>>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>>>>> ...
>>>>>>>> > The important issue in FastList is that it was written
with the
>>>>>>>> JDK1.4
>>>>>>>> > memory model. After moving River to Java 1.5, we'd have
the
>>>>>>>> JSR166 work
>>>>>>>> > and the new, consistent memory model where volatile
has a true
>>>>>>>> meaning.
>>>>>>>> > However, this code in particular is quite complex as
you have
>>>>>>>> noted, so
>>>>>>>> > even adjusting to the new memory model could be problematic.
>>>>>>>> 
>>>>>>>> I've just run a modified, simplified version of my test with
java
>>>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and
I still
>>>>>>>> get
>>>>>>>> the NullPointerException. This changes my thinking a bit.
I had been
>>>>>>>> working from the assumption that the issue was to do with
the
>>>>>>>> changes
>>>>>>>> in memory model between 1.4 and 1.5. I now have to consider
the
>>>>>>>> possibility of a more basic bug that is independent of the
memory
>>>>>>>> model.
>>>>>>>> 
>>>>>>>> If there is anyone with a FastList or Java memory model background
>>>>>>>> who
>>>>>>>> would like to help, please reply. I would welcome another
set of
>>>>>>>> eyes
>>>>>>>> on the code, and a cross check on my conclusions so far about
how
>>>>>>>> FastList is supposed to work. There seems to be a critical
invariant
>>>>>>>> that gets broken, and once that happens we are on track to
either a
>>>>>>>> NullPointerException or dropped items.
>>>>>>>> 
>>>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and
as a main
>>>>>>>> program (JDK 1.4 or later0. In both forms, all it does is
fire up a
>>>>>>>> mixture of threads that repeatedly add items to a FastList
and
>>>>>>>> threads
>>>>>>>> that repeatedly remove the first item they can from the FastList.
>>>>>>>> Failures seem to require simultaneous adds and removes.
>>>>>>>> 
>>>>>>>> If I don't nail this problem fairly soon, I may abandon the
current,
>>>>>>>> rather complicated, code and switch to writing a concurrent
high
>>>>>>>> performance FastList substitute for 1.5 or later.
>>>>>>>> 
>>>>>>>> Patricia
>>>>>>>> 
>>>>>>> I'll have a look tonight, no promises though ;)
>>>>>> 
>>>>>> I'm attaching the simplified test application main program that can
>>>>>> run, and
>>>>>> fail, under JRE 1.4, with no need for Junit.
>>>>>> 
>>>>>> Here's what I think I know. First of all, I have found some dubious
>>>>>> synchronization situations. However, fixing all the things I have
so
>>>>>> far found
>>>>>> of that type has only reduced the failure rate, not eliminated
>>>>>> failures. That
>>>>>> could be caused by changing timings without having any impact on
the
>>>>>> root cause.
>>>>>> 
>>>>>> The key invariant relates to a thread that is doing a scan, starting
>>>>>> with a call
>>>>>> to head() and proceeding through a series of calls to next() to
>>>>>> examine nodes.
>>>>>> The head() call sets up a guard node for the thread that was the
>>>>>> tail at some
>>>>>> point during the head call. The invariant is that the series of
>>>>>> next() calls
>>>>>> will reach the guard node before finding a null next pointer,
>>>>>> indicating the
>>>>>> actual tail.
>>>>>> 
>>>>>> The remove call does not really remove anything, it merely marks
the
>>>>>> node
>>>>>> removed. Removed nodes are unlinked as a side effect of scans,
>>>>>> during head and
>>>>>> next calls, but only if they are not guard nodes.
>>>>>> 
>>>>>> There are additional complications in the restart and reap methods,
>>>>>> but we can
>>>>>> ignore them for now - my test does not use them.
>>>>>> 
>>>>>> Once a guard node is lost, the synchronization breaks down
>>>>>> completely, because
>>>>>> e.g. insertion at tail is protected by synchronization on the
>>>>>> FastList instance,
>>>>>> but unlinking of a removed node in the middle is protected, to the
>>>>>> extent it is
>>>>>> protected at all, by synchronization on the FastList.Node instance
>>>>>> that is being
>>>>>> removed.
>>>>>> 
>>>>>> The commonest failure symptom is a scan reaching the null next
>>>>>> pointer at the
>>>>>> end of the FIFO during head(), without first finding the guard node
>>>>>> it just set
>>>>>> up. An alternative form of failure is loss of some entries - they
>>>>>> get added, but
>>>>>> the remove threads never find them. The second symptom is
>>>>>> predominant in the
>>>>>> JavaSpaces stress test that got me started on this. Messing up a
>>>>>> next pointer
>>>>>> could cause either.
>>>>>> 
>>>>>> Incidentally, I'm curious about why it has such a fragile system
in
>>>>>> which the
>>>>>> state of a scan is partly tracked by thread, when it seems like an
>>>>>> obvious
>>>>>> candidate for the Iterator pattern. Callers do need to be able to
>>>>>> find out if a
>>>>>> remove call succeeded or not (the node may have been removed by
>>>>>> another thread),
>>>>>> but that could be done in an interface extending Iterator. The
>>>>>> WeakHashMap in a
>>>>>> node that keeps track of the threads for which it is a guard would
>>>>>> instead track
>>>>>> the Iterator. There would be no need for thread local storage, the
>>>>>> same data
>>>>>> could be kept in the Iterator.
>>>>>> 
>>>>>> Thanks for any time you can spend looking at this.
>>>>>> 
>>>>>> Patricia
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com


Mime
View raw message