mahout-user mailing list archives

From Eshwaran Vijaya Kumar
Subject Computing SVD Of "Large Sparse Data"
Date Fri, 03 Jun 2011 23:48:48 GMT
Hello all,  
  We are trying to build a clustering system that will include an SVD component. I believe Mahout
has two SVD solvers: DistributedLanczosSolver and SSVD. Could someone give me some tips on
which solver would be the better choice, given that the data will be roughly
100 million rows with roughly 50K dimensions each (a 100,000,000 x 50,000 matrix)? We will
be working with text data, so the resulting matrix should be relatively sparse to begin with.
