mahout-dev mailing list archives

From "Ted Dunning (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-792) Add new stochastic decomposition code
Date Fri, 26 Aug 2011 22:31:29 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092099#comment-13092099 ]

Ted Dunning commented on MAHOUT-792:
------------------------------------

{quote}
But my response has always been, if you have input thinner than projection, then why even
use projection?
{quote}
Because it is cheaper to process a thin dense matrix than a wide sparse one.

Likewise, even if B is bigger than A, it can be much cheaper to compute the SVD of B than
of A, if only because the Cholesky trick works on the skinny dense case.  If the Cholesky
decomposition loses too much accuracy, a QR or LQ decomposition could be used the same way.
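
For readers who have not seen it, here is a minimal sketch of one common reading of the
"Cholesky trick" for a skinny dense matrix Y (many rows, k columns): form the small k x k
Gram matrix G = Y^T Y in one pass over the rows, factor G = R^T R by Cholesky, and take
Q = Y R^-1, which has orthonormal columns. An SVD of the small R then yields an SVD of Y.
This is an illustration only, not the code in the attached patch; the function names
(gram, cholesky_upper, orthonormalize) are made up for this example, and the matrix is
held as a plain list of rows. Forming Y^T Y squares the condition number, which is where
the accuracy loss mentioned above can come from.

```python
import math


def gram(Y):
    """Return G = Y^T Y for a tall, skinny matrix Y given as a list of rows."""
    k = len(Y[0])
    G = [[0.0] * k for _ in range(k)]
    for row in Y:  # one pass over the rows; only k*k numbers are kept
        for i in range(k):
            for j in range(k):
                G[i][j] += row[i] * row[j]
    return G


def cholesky_upper(G):
    """Return upper-triangular R with R^T R = G (no pivoting)."""
    k = len(G)
    R = [[0.0] * k for _ in range(k)]
    for i in range(k):
        s = G[i][i] - sum(R[p][i] ** 2 for p in range(i))
        if s <= 0.0:
            raise ValueError("matrix is not numerically positive definite")
        R[i][i] = math.sqrt(s)
        for j in range(i + 1, k):
            dot = sum(R[p][i] * R[p][j] for p in range(i))
            R[i][j] = (G[i][j] - dot) / R[i][i]
    return R


def orthonormalize(Y):
    """Return (Q, R) with Y = Q R and Q^T Q = I, computed via Cholesky of Y^T Y."""
    R = cholesky_upper(gram(Y))
    k = len(R)
    Q = []
    for y in Y:
        # Solve q R = y for the row q by forward substitution on the columns of R.
        q = [0.0] * k
        for j in range(k):
            q[j] = (y[j] - sum(q[p] * R[p][j] for p in range(j))) / R[j][j]
        Q.append(q)
    return Q, R


if __name__ == "__main__":
    Y = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    Q, R = orthonormalize(Y)
    print(gram(Q))  # should be close to the 2 x 2 identity
```

The only large object touched is Y itself, and it is read row by row; everything else is
k x k. If Y is badly conditioned the Cholesky step can fail or lose digits, in which case
a QR of Y gives the same Q and R more stably at higher cost.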

> Add new stochastic decomposition code
> -------------------------------------
>
>                 Key: MAHOUT-792
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-792
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Ted Dunning
>         Attachments: MAHOUT-792.patch, MAHOUT-792.patch, sd-2.pdf
>
>
> I have figured out some simplification for our SSVD algorithms.  This eliminates the
> QR decomposition and makes life easier.
> I will produce a patch that contains the following:
>   - a CholeskyDecomposition implementation that does pivoting (and thus rank-revealing)
>     or not.  This should actually be useful for solution of large out-of-core least
>     squares problems.
>   - an in-memory SSVD implementation that should work for matrices up to about 1/3 of
>     available memory.
>   - an out-of-core SSVD threaded implementation that should work for very large matrices.
>     It should take time about equal to the cost of reading the input matrix 4 times and
>     will require working disk roughly equal to the size of the input.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
