mahout-commits mailing list archives

From conflue...@apache.org
Subject [CONF] Apache Mahout > Stochastic Singular Value Decomposition
Date Sat, 26 Nov 2011 00:24:00 GMT
Space: Apache Mahout (https://cwiki.apache.org/confluence/display/MAHOUT)
Page: Stochastic Singular Value Decomposition (https://cwiki.apache.org/confluence/display/MAHOUT/Stochastic+Singular+Value+Decomposition)


Edited by Grant Ingersoll:
---------------------------------------------------------------------
The Stochastic SVD (SSVD) method in Mahout produces a reduced-rank Singular Value Decomposition
in its strict mathematical form: A ≈ USV'.
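
For reference, the reduced-rank form can be written out with explicit dimensions. This is a sketch using standard SVD notation; the symbols m, n and k (the requested rank) are assumptions introduced here and do not appear on the original page. For a reduced rank the equality holds only approximately.

```latex
% Assumed notation: A is m x n, k is the requested rank (k <= min(m, n)).
\[
  A \approx U S V^{\top}, \qquad
  U \in \mathbb{R}^{m \times k}, \quad
  S = \operatorname{diag}(\sigma_1, \dots, \sigma_k), \quad
  V \in \mathbb{R}^{n \times k},
\]
\[
  U^{\top} U = I_k, \qquad V^{\top} V = I_k, \qquad
  \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_k \ge 0 .
\]
```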

h5. The benefits over other methods are: 
* Fewer flops are required than with Krylov subspace methods.
* In a map-reduce setting, the number of MR iterations is fixed regardless of the rank requested.
* The precision/speed balance can be tuned with options.
* A is a Distributed Row Matrix whose rows may be identified by any Writable (such as a document
path), so SSVD can work directly on the output of seq2sparse. 

h5. Map-reduce characteristics: 
SSVD uses at most 3 MR steps (map-only + map-reduce + optional map-reduce) to produce a reduced-rank
approximation of the U, V and S matrices. If power iterations are requested, each one adds
two more map-reduce steps.
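
The step count described above can be sketched as simple arithmetic. This is only an illustration: the function name `ssvd_mr_steps` and its parameters are hypothetical and are not part of Mahout's API or command line.

```python
def ssvd_mr_steps(power_iterations=0, include_optional_step=True):
    """Upper bound on MR steps for one SSVD run, per the description above.

    Hypothetical helper for illustration only; not a Mahout function.
    """
    if power_iterations < 0:
        raise ValueError("power_iterations must be non-negative")
    # map-only + map-reduce, plus the optional map-reduce step
    base_steps = 3 if include_optional_step else 2
    # each requested power iteration adds two more map-reduce steps
    return base_steps + 2 * power_iterations


if __name__ == "__main__":
    for q in range(3):
        print(q, ssvd_mr_steps(q))
```

Under this reading, the count depends only on the number of power iterations and on whether the optional step runs, not on the requested rank.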

h5. Potential drawbacks: 
* Potentially less precise (although adding even one power iteration appears to improve
precision considerably).

h5. Documentation

[Overview and Usage|^SSVD-CLI.pdf]

Note: Please use 0.6 trunk or later!

(Todo: add a tutorial example.)
