commons-dev mailing list archives

From "Jason Horman" <ja...@jhorman.org>
Subject RE: [functor] Iterator and Generator
Date Sat, 19 Jul 2003 06:32:43 GMT
I think there are a few reasons that generators may be
better than, or a complement to, iterators.

They are somewhat easier to write. You literally just
write a loop that calls a function on each iteration. Writing
a generator as an anonymous inner class is easier since there is
only one method to implement. The notion of generators
is taken from Python, where you can write something like
this:

def loop():
  for x in range(0, 100):
    if x % 2:
      yield x

for x in loop(): .....

If not for stop(), generators would have no state and could be
re-run/re-used, even in a multithreaded environment. Some generators
may not even care about stop() and could be reused as is.
Iterators require state.

One of the best things about the generators is the ability
to nest "algorithms".

gen.apply(func).filter(pred).apply(func2);

And the best part is that no intermediate collections need
to be created.
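
A minimal sketch of that chaining, under stated assumptions: SketchGenerator and IntRange are hypothetical names, not the [functor] classes, and the java.util.function interfaces (which postdate this thread) stand in for UnaryFunction, UnaryPredicate and UnaryProcedure. Each stage wraps the previous one and forwards elements as they are produced, so only the final toCollection() builds a collection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-in for a generator: a subclass implements only run().
abstract class SketchGenerator<T> {
  /** Pushes every element to proc, one at a time. */
  abstract void run(Consumer<? super T> proc);

  /** Transforms each element as it flows past; nothing is buffered. */
  <R> SketchGenerator<R> apply(final Function<? super T, ? extends R> func) {
    final SketchGenerator<T> source = this;
    return new SketchGenerator<R>() {
      @Override
      void run(final Consumer<? super R> proc) {
        source.run(x -> proc.accept(func.apply(x)));
      }
    };
  }

  /** Forwards only the elements that satisfy pred. */
  SketchGenerator<T> filter(final Predicate<? super T> pred) {
    final SketchGenerator<T> source = this;
    return new SketchGenerator<T>() {
      @Override
      void run(final Consumer<? super T> proc) {
        source.run(x -> { if (pred.test(x)) { proc.accept(x); } });
      }
    };
  }

  /** The only place in the chain where a collection is created. */
  List<T> toCollection() {
    final List<T> out = new ArrayList<>();
    run(out::add);
    return out;
  }
}

// Hypothetical generator over [from, to); the loop is the whole implementation.
class IntRange extends SketchGenerator<Integer> {
  private final int from;
  private final int to;

  IntRange(int from, int to) {
    this.from = from;
    this.to = to;
  }

  @Override
  void run(Consumer<? super Integer> proc) {
    for (int i = from; i < to; i++) {
      proc.accept(i);
    }
  }
}
```

Because IntRange keeps no cursor between calls, the same chain can be run more than once and produces the same elements each time.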

I think we may find that the iterator approach doesn't
handle this very well. Think about how the new select would
work. It would look something like this:

public static final Generator select(final Generator gen,
                                     final UnaryPredicate pred) {
  return new BaseGenerator() {
    public Object next() {
      Object obj = gen.next();
      while(!pred.test(obj) && gen.hasNext()) {
        obj = gen.next();
      }

      // i guess stop here if done
      if (!gen.hasNext()) {
        stop();
      }

      return obj;
    }

    // if no pred's match in next() hasNext will have lied.
    public boolean hasNext() {
       return gen.hasNext();
    }

    // responsibility to stop() is put back on the client.
    public void stop() {
       super.stop(); // mark this wrapper as stopped
       gen.stop();
    }
  };
}

hasNext() may have lied if the predicate matched no elements.
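
For comparison, here is a sketch of what an iterator-style select needs in order for hasNext() to be truthful: it has to read ahead and buffer one matching element. SelectingIterator is a hypothetical name, not part of [functor], and java.util.function.Predicate stands in for UnaryPredicate. The two extra fields are exactly the kind of state mentioned earlier.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical filtering iterator that looks ahead so hasNext() never lies.
class SelectingIterator<T> implements Iterator<T> {
  private final Iterator<? extends T> source;
  private final Predicate<? super T> pred;
  private T lookahead;                  // next matching element, if buffered
  private boolean hasLookahead = false; // true when lookahead is valid

  SelectingIterator(Iterator<? extends T> source, Predicate<? super T> pred) {
    this.source = source;
    this.pred = pred;
  }

  @Override
  public boolean hasNext() {
    // Advance the source until a match is buffered or the source runs out.
    while (!hasLookahead && source.hasNext()) {
      T candidate = source.next();
      if (pred.test(candidate)) {
        lookahead = candidate;
        hasLookahead = true;
      }
    }
    return hasLookahead;
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    T result = lookahead;
    lookahead = null;
    hasLookahead = false;
    return result;
  }
}
```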

The client also has to worry about calling stop(), doesn't it?
The code changes from:

Collection rlines = Algorithms.apply(new Lines(file), 
			  new SomeFunc()).toCollection();

to:

Lines lines = new Lines(file);
Collection rlines = null;
try {
  rlines = Algorithms.apply(lines, 
			     new SomeFunc()).toCollection();
} finally {
  lines.stop(); // to close the file
}

You can't really depend on stop() being called without the finally
block with the iterator. Compare the above code with the current 
Algorithms.select() and the old EachLine that handled its own 
try/finally and closed the file. I think the generator version is easier.
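
To illustrate the point, here is a rough sketch of a generator that owns its cleanup, in the spirit of the old EachLine. EachLineSketch is a hypothetical name and this is not the original class; it assumes a Reader as input and uses try-with-resources, the modern shorthand for the try/finally described above. The client never sees stop() or close().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical EachLine-style generator: it controls the loop, so it can
// guarantee the reader is closed even if the procedure throws.
class EachLineSketch {
  private final Reader source;

  EachLineSketch(Reader source) {
    this.source = source;
  }

  /** Passes each line to proc, then closes the underlying reader. */
  void run(Consumer<? super String> proc) {
    try (BufferedReader in = new BufferedReader(source)) {
      String line;
      while ((line = in.readLine()) != null) {
        proc.accept(line);
      }
    } catch (IOException e) {
      // Assumption for this sketch: surface I/O failures as unchecked.
      throw new UncheckedIOException(e);
    }
  }
}
```

Note that this sketch is single-use, since the reader is consumed and closed by the first run().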

I like the inversion of control that generators introduce. Generators
control the pace of iteration. This may be important in some
applications. I can imagine a multithreaded webspider Generator 
for example.

I think that the IteratorToGenerator adapter provides the
iterator support many will need/want, and in addition we
have the benefit of keeping generators as well. Why not
support both?
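
As a rough illustration of how small such an adapter can be, here is a sketch. RunnableGenerator and IteratorToGeneratorSketch are hypothetical names chosen for this example and do not reproduce the actual IteratorToGenerator class; java.util.function.Consumer stands in for UnaryProcedure.

```java
import java.util.Iterator;
import java.util.function.Consumer;

// Hypothetical minimal generator contract: one method, internal iteration.
interface RunnableGenerator<T> {
  void run(Consumer<? super T> proc);
}

// Hypothetical adapter: drives an existing Iterator from inside run().
class IteratorToGeneratorSketch<T> implements RunnableGenerator<T> {
  private final Iterator<? extends T> iter;

  IteratorToGeneratorSketch(Iterator<? extends T> iter) {
    this.iter = iter;
  }

  @Override
  public void run(Consumer<? super T> proc) {
    // The iterator carries its own cursor, so a second run() sees
    // only whatever the first run() left unconsumed.
    while (iter.hasNext()) {
      proc.accept(iter.next());
    }
  }
}
```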


Other points:

For iterators, in some cases (NumberRanges, Collections) stop() 
makes more sense than close(), but for databases and files close() 
definitely makes more sense. For generators stop() always makes
sense I think. Minor detail.

I do like having the "algorithms" in both Algorithms and in
BaseGenerator. For some reason I prefer:

Algorithms.apply(new gen(), new func()).filter(pred) 
over: gen.from(something).apply(func).filter(pred)

Anyway, since I wrote it I support it. I am definitely not against
iterators. I do see value in both approaches, and in supporting both.

-jason horman
 jason@jhorman.org


-----Original Message-----
From: Rodney Waldhoff [mailto:rwaldhoff@apache.org]
Sent: Friday, July 18, 2003 8:05 PM
To: commons-dev@jakarta.apache.org
Subject: [functor] Iterator and Generator


As far as I can tell, the role of a generator is essentially that of an
Iterator: it provides a mechanism for doing something to each element in a
"collection" (not necessarily a Collection).  The major differences are:

* Generator has a "close" method (currently stop()).  This is used for
things that need to clean up after themselves a little bit, like closing
files or sockets.  EachLine was an example of this.

* Generator has convenience methods for "internal iteration"
(Algorithms.*)

* Generator doesn't have a next() function; currently it only exposes the
"internal iteration" methods like "run(UnaryProcedure)"

It seems to me that it is possible to simplify and unify these two
concepts with an implementation like the following:

interface Generator extends Iterator {
  /** "stops" this Generator, freeing any associated resources */
  void stop();

  // the "convenience methods", if desired at this level
  Generator apply(UnaryFunction f);
  boolean contains(UnaryPredicate p);
  Object detect(UnaryPredicate p);
  // etc.
}

abstract class BaseGenerator implements Generator {
  public abstract Object next();

  public void remove() {
    throw new UnsupportedOperationException();
  }

  public boolean hasNext() {
    return !(isStopped());
  }

  public void stop() {
    closed = true;
  }

  // insert implementations of the "convenience methods" here, e.g.,

  public void foreach(UnaryProcedure proc) {
    while(hasNext()) {
       proc.execute(next());
    }
  }

  /** note this method is protected here */
  protected boolean isStopped() {
    return closed;
  }

  protected void finalize() {
    if(!isStopped()) { stop(); }
  }

  private boolean closed = false;
}


Implementations of Generator would then look something like:

/**
 * This one is infinite, if you don't call
 * stop(), it'll generate for ever.
 */
class RandomIntegers extends BaseGenerator {
  public Object next() {
    return new Integer(random.nextInt());
  }

  private Random random = new Random();
}

or,

/**
 * This is finite and doesn't really need
 * to be manually stopped.  It's really no
 * better than an Iterator, except it adds
 * the convenience methods like:
 *  Elements.from(myArray).contains(myPredicate);
 */
class Elements extends BaseGenerator {
  public Elements(Object[] values) {
    this.values = values;
    this.next = 0;
  }

  public boolean hasNext() {
     return !isStopped() && (next < values.length);
  }

  public Object next() {
     return values[next++];
  }

  public static Elements from(Object[] values) {
     return new Elements(values);
  }

  /** You could override stop() here if you want: */
  public void stop() {
    values = null;
    super.stop();
  }

  private int next;
  private Object[] values;
}

or,

/**
 * This one actually needs to be stopped,
 * although the finalizer will protect
 * you a little bit.  (And we call
 * stop() internally if you iterate all
 * the way to the end of the stream.)
 *
 * You can treat this as an Iterator
 * (i.e., pass it off to a method that
 * expects an Iterator), but you'll
 * have to follow the "close what you open"
 * strategy and call stop() yourself.
 */
class Lines extends BaseGenerator {
  public Lines(BufferedReader in) {
    this.in = in;
  }

  public boolean hasNext() {
    return !isStopped() && (nextSet || setNext());
  }

  public Object next() {
    if(hasNext()) {
      nextSet = false;
      return next;
    } else {
      throw new NoSuchElementException();
    }
  }

  public void stop() {
    in.close();
    in = null;
    next = null;
    super.stop();
  }

  private boolean setNext() {
    next = in.readLine();
    if(null == next) {
      stop();
      nextSet = false;
    } else {
      nextSet = true;
    }
    return nextSet;
  }

  private boolean nextSet = false;
  private String next = null;
  private BufferedReader in;
}

etc.

Is there a disadvantage to doing it this way that I'm missing?

- Rod <http://radio.weblogs.com/0122027/>

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org





