Author: buildbot
Date: Sat Mar 8 06:18:24 2014
New Revision: 900525
Log:
Staging update by buildbot for mahout
Modified:
websites/staging/mahout/trunk/content/ (props changed)
websites/staging/mahout/trunk/content/users/dimreduction/ssvd.html
Propchange: websites/staging/mahout/trunk/content/

--- cms:source-revision (original)
+++ cms:source-revision Sat Mar 8 06:18:24 2014
@@ -1 +1 @@
-1575489
+1575490
Modified: websites/staging/mahout/trunk/content/users/dimreduction/ssvd.html
==============================================================================
--- websites/staging/mahout/trunk/content/users/dimreduction/ssvd.html (original)
+++ websites/staging/mahout/trunk/content/users/dimreduction/ssvd.html Sat Mar 8 06:18:24 2014
@@ -294,15 +294,15 @@ As of 0.7 trunk, includes PCA and dimens
map-reduce characteristics:
SSVD uses at most 3 MR sequential steps (map-only + map-reduce + 2 optional parallel map-reduce
jobs) to produce reduced rank approximation of U, V and S matrices. Additionally, two more
map-reduce steps are added for each power iteration step if requested.</p>
<p><strong>Potential drawbacks:</strong></p>
<p>potentially less precise (but adding even one power iteration seems to fix that
quite a bit).
Documentation
Overview and Usage
+<p>potentially less precise (but adding even one power iteration seems to fix that
quite a bit).</p>
+<p><strong>Documentation</strong></p>
+<p><a href="ssvd.page/ssvd.pdf">Overview and Usage</a>
Note: Please use 0.6 or later! for PCA workflow, please use 0.7 or later.</p>
<p><strong>Publications</strong></p>
<p><a href="http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf">Nathan
Halko's dissertation</a> "Randomized methods for computing low-rank
approximations of matrices" contains comprehensive definition of parallelization strategy
taken in Mahout SSVD implementation and also some precision/scalability benchmarks, esp. w.r.t.
Mahout Lanczos implementation on a typical corpus data set.</p>
<p><strong>R simulation</strong></p>
<p>Non-parallel SSVD simulation in R with power iterations and PCA options. Note that
this implementation is not most optimal for sequential flow solver, but it is for demonstration
purposes only.</p>
+<p><a href="ssvd.page/ssvd.R">Non-parallel SSVD simulation in R</a> with
power iterations and PCA options. Note that this implementation is not most optimal for sequential
flow solver, but it is for demonstration purposes only.</p>
<p>However, try this R code to simulate a meaningful input:</p>
<div class="codehilite"><pre> tests.R
n<span class="o">&lt;-</span><span class="m">1000</span>
