mahout-dev mailing list archives

From "Grant Ingersoll (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAHOUT-206) Separate and clearly label different SparseVector implementations
Date Tue, 24 Nov 2009 17:38:39 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782054#action_12782054 ]

Grant Ingersoll commented on MAHOUT-206:
----------------------------------------

Jake, there's something odd in this patch with regard to SparseVector: it didn't delete
the file, but instead left it empty.

There still seems to be enough commonality between the two implementations (size, cardinality,
etc.) that I think it would be worthwhile to keep SparseVector as an abstract class that
the other two extend.
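
A rough sketch of what that layout could look like, purely for illustration. All class and
method names below are hypothetical and are not taken from the patch, and java.util.HashMap
stands in for the OpenIntDoubleMap mentioned in the issue. The abstract base keeps the shared
pieces (cardinality, index checking), while one subclass uses a hash map and the other uses
sorted int/double parallel arrays:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstract base: holds what both implementations share.
abstract class SparseVectorSketch {
    private final int cardinality; // logical length of the vector

    protected SparseVectorSketch(int cardinality) {
        this.cardinality = cardinality;
    }

    public final int cardinality() {
        return cardinality;
    }

    // Shared bounds check used by both subclasses.
    protected final void checkIndex(int index) {
        if (index < 0 || index >= cardinality) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    public abstract int size(); // number of stored non-zero entries
    public abstract double get(int index);
    public abstract void set(int index, double value);
}

// Hash-based storage: cheap random get/set, no ordering guarantee.
// HashMap is a stand-in here, not the primitive map used in Mahout.
final class RandomAccessSketch extends SparseVectorSketch {
    private final Map<Integer, Double> values = new HashMap<>();

    RandomAccessSketch(int cardinality) {
        super(cardinality);
    }

    @Override public int size() {
        return values.size();
    }

    @Override public double get(int index) {
        checkIndex(index);
        return values.getOrDefault(index, 0.0);
    }

    @Override public void set(int index, double value) {
        checkIndex(index);
        if (value == 0.0) {
            values.remove(index); // zeros are not stored
        } else {
            values.put(index, value);
        }
    }
}

// Sorted parallel arrays: compact and fast to scan in index order,
// but inserting a new index shifts the tail of both arrays.
final class SequentialAccessSketch extends SparseVectorSketch {
    private int[] indices = new int[4];
    private double[] values = new double[4];
    private int count;

    SequentialAccessSketch(int cardinality) {
        super(cardinality);
    }

    @Override public int size() {
        return count;
    }

    @Override public double get(int index) {
        checkIndex(index);
        int pos = Arrays.binarySearch(indices, 0, count, index);
        return pos >= 0 ? values[pos] : 0.0;
    }

    @Override public void set(int index, double value) {
        checkIndex(index);
        int pos = Arrays.binarySearch(indices, 0, count, index);
        if (pos >= 0) {
            if (value != 0.0) {
                values[pos] = value; // overwrite existing entry
            } else {
                // Remove the entry by shifting the tail left by one.
                System.arraycopy(indices, pos + 1, indices, pos, count - pos - 1);
                System.arraycopy(values, pos + 1, values, pos, count - pos - 1);
                count--;
            }
            return;
        }
        if (value == 0.0) {
            return; // nothing to store
        }
        int insertAt = -(pos + 1);
        if (count == indices.length) { // grow both arrays together
            indices = Arrays.copyOf(indices, count * 2);
            values = Arrays.copyOf(values, count * 2);
        }
        System.arraycopy(indices, insertAt, indices, insertAt + 1, count - insertAt);
        System.arraycopy(values, insertAt, values, insertAt + 1, count - insertAt);
        indices[insertAt] = index;
        values[insertAt] = value;
        count++;
    }
}
```

With this shape, callers still have to pick a concrete class when constructing a vector, so
the explicit choice argued for in the issue is preserved even though a common base exists.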

> Separate and clearly label different SparseVector implementations
> -----------------------------------------------------------------
>
>                 Key: MAHOUT-206
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-206
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Matrix
>    Affects Versions: 0.2
>         Environment: all
>            Reporter: Jake Mannix
>            Assignee: Grant Ingersoll
>             Fix For: 0.3
>
>         Attachments: MAHOUT-206.patch
>
>
> Shashi's last patch on MAHOUT-165 swapped out the int/double parallel-array implementation of SparseVector
> for an OpenIntDoubleMap (hash-based) one.  We actually need both, as I think I've mentioned
> a gazillion times.
> There was a patch, long ago, on MAHOUT-165, in which Ted had OrderedIntDoubleVector
> and OpenIntDoubleHashVector (or something to that effect), and neither of them is called
> SparseVector.  I like this, because it forces people to choose what kind of SparseVector they
> want (and they should: sparse is an optimization, and the client should make a conscious decision
> about what they're optimizing for).
> We could call them RandomAccessSparseVector and SequentialAccessSparseVector, to be really
> obvious.
> But really, the important part is that we have both.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

