commons-dev mailing list archives

From "Bill Barker" <billwbar...@verizon.net>
Subject Re: [Math] "iterator" and "sparseIterator" in "RealVector" hierarchy
Date Mon, 15 Aug 2011 01:38:16 GMT
I'm in favor of moving some methods to the SparseXXX interface, but got
voted down at the time. For application developers (like me), who can
expect in advance whether the Vector/Matrix is sparse or not, it isn't a
big deal. But I can see how it may cause problems for other libraries that
want to leverage C-M.  And actually, I'm having problems seeing why it is a
big deal in general.  If I'm doing an operation like outer product, I would
still prefer that the iterator skips the zero entries.
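
To illustrate the outer-product case, here is a minimal sketch. It does not
use the C-M classes at all: the sparse vectors are stood in for by plain
index -> value maps, and the class and method names are made up for the
example. The point is only that visiting stored entries touches
(non-zeros in u) x (non-zeros in v) cells instead of every cell.

```java
import java.util.Map;
import java.util.TreeMap;

public class SparseOuterProductSketch {

    /**
     * Outer product of two vectors described only by their stored
     * (non-zero) entries, given as index -> value maps.
     * Hypothetical helper, not part of Commons Math.
     */
    static double[][] outerProduct(Map<Integer, Double> u, int uDim,
                                   Map<Integer, Double> v, int vDim) {
        // Every cell starts at 0.0, which is already the right answer
        // wherever either operand has no stored entry.
        double[][] result = new double[uDim][vDim];
        for (Map.Entry<Integer, Double> a : u.entrySet()) {
            for (Map.Entry<Integer, Double> b : v.entrySet()) {
                result[a.getKey()][b.getKey()] = a.getValue() * b.getValue();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Double> u = new TreeMap<>();
        u.put(1, 2.0);                 // u = (0, 2, 0)
        Map<Integer, Double> v = new TreeMap<>();
        v.put(0, 3.0);                 // v = (3, 0, 0, 4)
        v.put(3, 4.0);

        double[][] m = outerProduct(u, 3, v, 4);
        // Only 1 x 2 = 2 cells were written.
        System.out.println(m[1][0] + " " + m[1][3] + " " + m[0][0]);
        // prints "6.0 8.0 0.0"
    }
}
```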

-----Original Message----- 
From: Gilles Sadowski
Sent: Sunday, August 14, 2011 3:52 PM
To: dev@commons.apache.org
Subject: [Math] "iterator" and "sparseIterator" in "RealVector" hierarchy

Hi.

I'm rather confused by the appearance of sparseness handling at the level
of a "general" vector (i.e. "RealVector", "AbstractRealVector") as well as
at the level of a non-sparse data structure ("ArrayRealVector").

This makes for a lot of convoluted code containing "instanceof" operators...

It was also pointed out by Arne Plöse in
  https://issues.apache.org/jira/browse/MATH-626
that it could lead to inefficient code.

Following the suggestion there, I wonder whether we should perform some
cleanup of the "RealVector" hierarchy, such as moving methods that are
sparseness-related over to the "SparseRealVector" interface, and removing
anything sparseness-related from "AbstractRealVector" and "ArrayRealVector".

However, I don't have any idea of the implications of this refactoring. The
documentation is not very explicit about why the sparseness was introduced
in "RealVector" and the Javadoc for "sparseIterator()" adds to the
confusion:
---CUT---
    /**
     * Specialized implementations may choose to not iterate over all
     * dimensions, either because those values are unset, or are equal
     * to defaultValue(), or are small enough to be ignored for the
     * purposes of iteration.
     * No guarantees are made about order of iteration.
     * In dense implementations, this method will often delegate to
     * {@link #iterator()}.
     *
     * @return a sparse iterator
     */
    Iterator<Entry> sparseIterator();
---CUT---

All dimensions (plural)? "defaultValue()"? Order of iteration?

Also, I would expect that an iterator in "RealVector" would by default
iterate over all indices and return the sequence of _values_ (double), not
of "Entry" objects.
I assume that those "Entry" objects are necessary only for implementations
of sparse vectors to avoid returning the many entries set to the default
(non-stored) value...
In fact, don't you think that the "RealVector" interface should extend
"Iterable<Double>" while "SparseRealVector" would extend "Iterable<Entry>"?
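
A rough sketch of that split follows. All type and method names in it are
hypothetical and none of this is the current API. One caveat: Java does not
let a type inherit "Iterable" with two different type arguments, so if the
sparse interface stays a subtype of the general one it cannot literally
extend "Iterable<Entry>" as well. The sketch therefore assumes the sparse
side exposes its stored entries through a separate method.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

public class VectorHierarchySketch {

    /** Hypothetical index/value pair for stored entries. */
    static final class Entry {
        final int index;
        final double value;
        Entry(int index, double value) {
            this.index = index;
            this.value = value;
        }
    }

    /** Hypothetical general vector: iterates over ALL values, in index order. */
    interface GeneralVector extends Iterable<Double> {
        int getDimension();
        double getEntry(int index);
    }

    /** Hypothetical sparse vector: additionally exposes only the stored entries. */
    interface SparseVector extends GeneralVector {
        Iterable<Entry> storedEntries();
    }

    /** Toy map-backed implementation; unset indices read as 0.0. */
    static final class MapVector implements SparseVector {
        private final int dimension;
        private final TreeMap<Integer, Double> stored = new TreeMap<>();

        MapVector(int dimension) {
            this.dimension = dimension;
        }

        void setEntry(int index, double value) {
            stored.put(index, value);
        }

        public int getDimension() {
            return dimension;
        }

        public double getEntry(int index) {
            Double v = stored.get(index);
            return v == null ? 0.0 : v;
        }

        public Iterator<Double> iterator() {
            return new Iterator<Double>() {
                private int cursor = 0;
                public boolean hasNext() {
                    return cursor < dimension;
                }
                public Double next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    return getEntry(cursor++);
                }
            };
        }

        public Iterable<Entry> storedEntries() {
            List<Entry> entries = new ArrayList<>();
            for (Map.Entry<Integer, Double> e : stored.entrySet()) {
                entries.add(new Entry(e.getKey(), e.getValue()));
            }
            return entries;
        }
    }

    public static void main(String[] args) {
        MapVector v = new MapVector(5);
        v.setEntry(1, 2.5);
        v.setEntry(4, -1.0);

        int all = 0;
        for (double value : v) {
            all++;
        }
        int storedCount = 0;
        for (Entry e : v.storedEntries()) {
            storedCount++;
        }
        System.out.println(all + " values, " + storedCount + " stored entries");
        // prints "5 values, 2 stored entries"
    }
}
```

With such a split, generic code would only see values, and code that cares
about sparseness would ask for it explicitly rather than through
"instanceof" checks.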


Best regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org 

