[ https://issues.apache.org/jira/browse/MAHOUT-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092099#comment-13092099 ]
Ted Dunning commented on MAHOUT-792:
------------------------------------
{quote}
But my response has always been, if you have input thinner than projection, then why even use projection?
{quote}
Because it is cheaper to process a thin dense matrix than a wide sparse one.
Likewise, even if B is bigger than A (a thin dense projection can hold more nonzero entries than a wide sparse input), it can be much cheaper to compute the SVD of B than of A, if only because the Cholesky trick works on the skinny dense case. If the Cholesky decomposition loses too much accuracy, a QR or LQ decomposition could be used the same way.
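As a minimal numpy sketch of that trick (an illustration, not the Mahout code): for a tall skinny dense B, the Cholesky factor of the small Gram matrix B'B plays the role of the R in a QR decomposition, so the full SVD of B reduces to the SVD of a k x k triangular matrix. The shapes and variable names below are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((10000, 20))  # tall, skinny, dense

# The Gram matrix is only k x k, so the heavy work is one pass over B.
G = B.T @ B
R = np.linalg.cholesky(G).T          # upper-triangular factor, G = R'R

# B = Q R with Q = B R^{-1}; solve with the triangular factor
# rather than forming an explicit inverse.
Q = np.linalg.solve(R.T, B.T).T      # Q has orthonormal columns

# SVD of the small k x k factor yields the SVD of B itself.
Ur, s, Vt = np.linalg.svd(R)
U = Q @ Ur                           # B = U diag(s) Vt
```

Note that this relies on B having full column rank; that is where the accuracy caveat above comes in, and where a pivoted Cholesky or a QR/LQ would be the fallback.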
> Add new stochastic decomposition code
> -------------------------------------
>
> Key: MAHOUT-792
> URL: https://issues.apache.org/jira/browse/MAHOUT-792
> Project: Mahout
> Issue Type: New Feature
> Reporter: Ted Dunning
> Attachments: MAHOUT-792.patch, MAHOUT-792.patch, sd-2.pdf
>
>
> I have figured out a simplification of our SSVD algorithms. It eliminates the QR decomposition and makes life easier.
> I will produce a patch that contains the following:
> - a CholeskyDecomposition implementation that optionally does pivoting (and is thus rank-revealing). This should also be useful for solving large out-of-core least squares problems.
> - an in-memory SSVD implementation that should work for matrices up to about 1/3 of available memory.
> - an out-of-core, threaded SSVD implementation that should work for very large matrices. It should take time roughly equal to the cost of reading the input matrix four times and will require working disk space roughly equal to the size of the input.
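The pieces described above fit together roughly as follows. This is a hedged numpy sketch of the general SSVD-without-QR idea, not the in-memory or out-of-core Java implementation in the patch; the function name `ssvd`, the oversampling parameter `p`, and the use of an unpivoted Cholesky are all illustrative assumptions.

```python
import numpy as np

def ssvd(A, k, p=10, seed=0):
    """Sketch of a rank-k stochastic SVD of A.

    Orthonormalizes the random projection via a Cholesky of its
    Gram matrix instead of a QR decomposition (illustrative only;
    assumes the projection Y has full column rank).
    """
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + p))  # random projection
    Y = A @ Omega                                     # tall, skinny, dense
    R = np.linalg.cholesky(Y.T @ Y).T                 # Y = Q R, R upper
    Q = np.linalg.solve(R.T, Y.T).T                   # orthonormal basis for range(Y)
    B = Q.T @ A                                       # small (k+p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]
```

The only passes over A are the projection `A @ Omega` and the small product `Q.T @ A`, which is what keeps the I/O cost at a handful of reads of the input in the out-of-core setting.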