mahout-dev mailing list archives

From Ted Dunning <>
Subject Re: [jira] Commented: (MAHOUT-34) Iterator interface for Vectors
Date Sat, 12 Apr 2008 22:19:27 GMT

I am not at all convinced about the need for generics for matrices.

First off, complex matrices are nice, but pretty rarely used in practice.
They come up a bit in electrical engineering, but real-only implementations
of those algorithms aren't so hard, nor is duplicating the API (once).

If it really were a common thing to do, and if there were many types that
were important, and if generics came for free, then it might be a good
thing.  I don't think that any of those are true.  It is important to avoid
some of the old Fortran problems where they had special purpose routines for
{sparse, dense, banded, triangular} x {float, double, complex-float,
complex-double}.  Having 16 versions of everything and having to use 6
letters to identify them was a nightmare.  In our code, structure is safely
polymorphic, floats are almost never necessary and complex numbers are
relatively rare.
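
To make the "structure is safely polymorphic" point concrete, here is a
minimal Java sketch. The names Vec, DenseVec, SparseVec and VecOps are
hypothetical and are not taken from the Mahout code or the MAHOUT-34 patch.
It assumes a double-only interface, so a single dot() routine covers every
storage layout and no per-structure or per-type variants are needed.

```java
import java.util.Arrays;

// Hypothetical, illustration only: not the Mahout Vector API.
// Values are always primitive doubles, so no generics and no boxing.
interface Vec {
    int size();
    double get(int index);
}

// Dense storage: every element is held explicitly.
final class DenseVec implements Vec {
    private final double[] values;

    DenseVec(double... values) {
        this.values = values.clone();
    }

    public int size() { return values.length; }

    public double get(int index) { return values[index]; }
}

// Sparse storage: only non-zero elements are held.
// Assumes `indices` is sorted ascending and parallel to `values`.
final class SparseVec implements Vec {
    private final int size;
    private final int[] indices;
    private final double[] values;

    SparseVec(int size, int[] indices, double[] values) {
        this.size = size;
        this.indices = indices.clone();
        this.values = values.clone();
    }

    public int size() { return size; }

    public double get(int index) {
        int pos = Arrays.binarySearch(indices, index);
        return pos >= 0 ? values[pos] : 0.0; // absent entries read as zero
    }
}

final class VecOps {
    private VecOps() {}

    // One routine for any combination of storage layouts.
    static double dot(Vec a, Vec b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("size mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.size(); i++) {
            sum += a.get(i) * b.get(i);
        }
        return sum;
    }
}
```

A call such as VecOps.dot(new DenseVec(1.0, 2.0, 3.0),
new SparseVec(3, new int[] {1}, new double[] {4.0})) works without the
caller knowing which layout either argument uses. A real implementation
would iterate over the non-zero entries rather than every index.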

I vote that we put off generic implementations for a time.

On 4/11/08 7:03 PM, "Samee Zahur (JIRA)" <> wrote:

> Samee Zahur commented on MAHOUT-34:
> -----------------------------------
> Well the reason I added VectorPair is that I was going through the MAHOUT-20
> codes and sought to remove every single for loop there :p I'm not sure what
> you mean by "static inner classes of the test" but yes, the thought of putting
> VectorPairElement and VectorPairIterator as inner classes to VectorPair did
> occur to me. At the time, I guess I just wanted to keep each file short and
> simple. And the thing is I really didn't see anything gained by making
> VectorPairElement inner. As for VectorPairIterator, we can simply make it a
> package private rather than inner, as users are supposed to access it as
> Iterator<VectorPairElement> anyway. That's pretty much all the factoring I
> could come up with.
> And yes, in fact more generic solutions are possible. But the most elegant
> ones I could come up with entailed modifying the Vector interface by making it
> use Java generics, like java.lang do. Presently the Vectors are tied down to
> Double values (might even be useful if we need complex-valued vectors later).
> That would allow the users to use a more or less uniform interface, and might
> even allow the "Element" class to be factored out. Problems with such a change
> might be:
> The Iterator logic used in each (VectorPair, SparseVector, DenseVector) is
> quite distinct and hard to specify from a generic class, given that Java does
> not support the kind of template specialization features of C++. But still, I
> guess it is doable by having concrete classes implement a generic interface.
> That would enable users to write code like:
> VectorIterator<Double> it = sparsevec.iterator();
> VectorIterator<Complex> jt = cmplxdensevec.iterator();
> VectorIterator<Pair<Double,Complex>> kt =
> Vector.pairiterator(sparsevec,cmplxdensevec);
> Could design it this way if you want me to. But is there any specific
> redundancy in these Iterator classes that you have in mind? And do bear in
> mind this would probably mean changing the Vector interface to a generic one -
> meaning other classes that depend on it will have to be tweaked accordingly
> (probably just by replacing Vector with Vector<Double>). This is why I
> opted for the simpler designs here.
>> Iterator interface for Vectors
>> ------------------------------
>>                 Key: MAHOUT-34
>>                 URL:
>>             Project: Mahout
>>          Issue Type: New Feature
>>            Reporter: Samee Zahur
>>            Assignee: Karl Wettin
>>         Attachments: VectorIterator.3.patch.bz2,
>> VectorIterator.patch.2.tar.bz2, VectorIterator.patch.tar.bz2
>> Implemented an Iterator interface for the Vector classes. Was necessary for
>> porting from Float[] used in some parts of the code.
