river-dev mailing list archives

From Peter Firmstone <j...@zeus.net.au>
Subject Re: Space/outrigger suggestions (remote iterator vs. collection)
Date Wed, 19 Jan 2011 10:01:07 GMT
Dan Creswell wrote:
> Note too that iterator does not support remote semantics whereas MatchSet
> does (all those explicit RemoteExceptions etc).
>
> An iterator as defined for the Java platform might fail for concurrency
> reasons (although notably it doesn't for many of the modern concurrent
> collections) or because an operation (typically remove) is not supported.
>   
Yes, I bumped into this recently when creating a concurrent policy 
implementation, although in my case it was with Enumeration: the backing 
set cannot be modified while the Enumeration is being read in a loop, and 
the same applies to an Iterator.
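
A minimal sketch of that failure, assuming a plain HashSet wrapped with Collections.enumeration (the class and method names here are made up for illustration, not taken from the policy implementation):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of River.
final class FailFastDemo {

    // Modifies the backing set while an Enumeration over it is being read.
    static boolean failsWhenModified() {
        Set<String> set = new HashSet<String>(Arrays.asList("a", "b", "c"));
        Enumeration<String> e = Collections.enumeration(set);
        try {
            while (e.hasMoreElements()) {
                e.nextElement();
                set.add("d"); // structural change behind the Enumeration's back
            }
        } catch (ConcurrentModificationException ex) {
            return true; // the underlying fail-fast iterator noticed the change
        }
        return false;
    }

    // Reads from a copy instead, so the backing set is free to change.
    static int countOverCopy() {
        Set<String> set = new HashSet<String>(Arrays.asList("a", "b", "c"));
        Enumeration<String> e = Collections.enumeration(new ArrayList<String>(set));
        int count = 0;
        while (e.hasMoreElements()) {
            e.nextElement();
            set.add("d"); // harmless: the Enumeration reads the copy
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("direct enumeration failed: " + failsWhenModified());
        System.out.println("elements seen via copy: " + countOverCopy());
    }
}
```

Reading from a copy avoids the exception, at the price of not seeing later changes.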

Although this is for local code only, I've got a Multi Read / Single 
Write Collections utility, similar to the Synchronized Collections 
utility in skunk/pepe.  Iterators are generated from a copy of the 
encapsulated collection, but when remove is called, the call is 
redirected to the underlying collection, and requires the write lock.  
So Enumerations and Iterators are up to date at their creation time, but 
become stale quickly.
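
A rough sketch of the idea described above, not the actual utility: the class name SnapshotCollection and its methods are assumptions for illustration, built on java.util.concurrent.locks.ReentrantReadWriteLock.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical multi-read / single-write wrapper with snapshot iterators.
final class SnapshotCollection<T> implements Iterable<T> {
    private final Collection<T> backing = new ArrayList<T>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    boolean add(T item) {
        lock.writeLock().lock();
        try {
            return backing.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    boolean remove(Object item) {
        lock.writeLock().lock();
        try {
            return backing.remove(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return backing.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Copies the contents under the read lock; many readers may do this at once.
    private List<T> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<T>(backing);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The iterator walks the copy, so it is current only at creation time.
    public Iterator<T> iterator() {
        final Iterator<T> it = snapshot().iterator();
        return new Iterator<T>() {
            private T last;
            private boolean canRemove;

            public boolean hasNext() {
                return it.hasNext();
            }

            public T next() {
                last = it.next();
                canRemove = true;
                return last;
            }

            // Redirected to the underlying collection; takes the write lock.
            public void remove() {
                if (!canRemove) throw new IllegalStateException();
                canRemove = false;
                SnapshotCollection.this.remove(last);
            }
        };
    }
}
```

An element added after iterator() returns is not seen by that iterator, which is the staleness mentioned above.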

When accessing large disk records and pulling them into memory, you 
typically take only a small chunk, process it and move on, so the 
behaviour is more stream-like than iterator-like.  For this I created an 
interface called ResultStream (yes, it supports Generics, but beware of 
compilation boundaries that haven't been checked by the compiler); 
a result stream is terminated by a null value.

So you can process very large amounts of data, in small doses, in a 
loop, like this one, which is actually also an implementation of 
ResultStream.get() and performs filtering operations:

    public ServiceItem get() {
        // Pull items until one passes a filter or the input stream ends.
        for (Object item = inputResultStream.get(); item != null;
                item = inputResultStream.get()) {
            if (item instanceof ServiceItem) {
                ServiceItem it = (ServiceItem) item;
                int l = filters.size();
                for (int i = 0; i < l; i++) {
                    ServiceItemFilter filter = filters.get(i);
                    if (filter == null) continue;
                    if (filter.check(it)) return it;
                } // end filter loop
            } // If it isn't a ServiceItem it is ignored.
        } // end item loop
        return null; // The input stream terminated: its item was null.
    }

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.river.api.util;

/**
 * This interface is similar to an Enumeration; it is designed to return
 * results incrementally in loops. However, unlike an Enumeration, there is
 * no check-first operation, as implementors must return a null value after
 * the backing data source has been exhausted. So this terminates like a
 * stream, by returning a null value.
 *
 * @author Peter Firmstone
 */
public interface ResultStream<T> {
    /**
     * Get next T, call from a loop until T is null;
     * @return T unless end of stream in which case null is returned.
     */
    public T get();
    /**
     * Close the result stream, this allows the implementer to close any
     * resources prior to deleting reference.
     */
    public void close();
}
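
One way a caller might consume such a stream, as a sketch only: the interface is repeated so the example stands alone, and IteratorResultStream and ResultStreams are hypothetical helpers, not part of the River code.

```java
import java.util.Arrays;
import java.util.Iterator;

// Repeated from the interface above so the sketch compiles on its own.
interface ResultStream<T> {
    T get();
    void close();
}

// Hypothetical adapter: exposes any Iterator as a ResultStream.
final class IteratorResultStream<T> implements ResultStream<T> {
    private Iterator<T> source;

    IteratorResultStream(Iterator<T> source) {
        this.source = source;
    }

    public T get() {
        if (source == null) return null; // already closed
        while (source.hasNext()) {
            T next = source.next();
            if (next != null) return next; // skip nulls: null means end of stream
        }
        return null;
    }

    public void close() {
        source = null; // drop the reference to the backing data
    }
}

// Hypothetical consumer that handles one item at a time.
final class ResultStreams {
    static long totalLength(ResultStream<String> stream) {
        long total = 0;
        try {
            for (String s = stream.get(); s != null; s = stream.get()) {
                total += s.length();
            }
        } finally {
            stream.close(); // release resources even if processing fails
        }
        return total;
    }

    public static void main(String[] args) {
        ResultStream<String> stream =
                new IteratorResultStream<String>(Arrays.asList("ab", "cde").iterator());
        System.out.println(totalLength(stream));
    }
}
```

The try/finally is a design choice for the caller, since the interface leaves closing to whoever holds the stream.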

> MatchSet semantics are substantially different. I'm sure it's still possible
> to do a wrapper around MatchSet that looks like an iterator but there will
> be some implementation cracks to paper over in respect of hiding away
> exceptions and so on.
>
> Dan.
>
> On 19 January 2011 07:54, Patricia Shanahan <pats@acm.org> wrote:
>
>   
>> I don't think we should commit to a single class doing both Iterable and
>> Iterator. An Iterable is already committed to being able to supply an
>> Iterator on demand, but often the Iterator implementation is better done as
>> a private class member of the Iterable. Note that an Iterable needs to be
>> able to supply a new Iterator each time its iterator() method is called.
>>
>> I'm not sure what you are saying about combining Iterator and the MatchSet
>> features. My inclination would be to keep each interface simple and clean.
>> Many classes will implement Iterable and appropriate interfaces representing
>> the snapshot and lease capabilities.
>>
>> As you can probably guess from the length of this reply, I'm back from
>> Egypt and have a full keyboard, not just an iPhone.
>>
>> Patricia
>>
>>
>>
>> James Grahn wrote:
>>
>>     
>>> I should also add, we'd likely need to derive our own class extending
>>> Iterable & Iterator to avoid losing existing MatchSet methods of getSnapshot
>>> and getLease.
>>>
>>> I don't see an immediate problem with this; a collection-backed
>>> Iterable/Iterator would always have a null Lease, correct?
>>>
>>> jamesG
>>>
>>> On 1/18/2011 5:38 PM, James Grahn wrote:
>>>
>>>       
>>>> It (finally) occurred to me that we can have our cake and eat it too in
>>>> this case.
>>>>
>>>> We can have the sweet deliciousness of API symmetry and retain the
>>>> implementation advantages of remote iterator & collection by having both
>>>> take-multiple and contents return:
>>>> Iterable.
>>>>
>>>>
>>>>         
>
>   

