mahout-commits mailing list archives

From conflue...@apache.org
Subject [CONF] Apache Lucene Mahout > SVD - Singular Value Decomposition
Date Sat, 09 Jan 2010 20:06:00 GMT
Space: Apache Lucene Mahout (http://cwiki.apache.org/confluence/display/MAHOUT)
Page: SVD - Singular Value Decomposition (http://cwiki.apache.org/confluence/display/MAHOUT/SVD+-+Singular+Value+Decomposition)


Edited by Ted Dunning:
---------------------------------------------------------------------
Singular Value Decomposition is a form of product decomposition of a matrix in which a rectangular
matrix A is decomposed into a product U s V' where U and V have orthonormal columns and s is
a diagonal matrix of non-negative singular values.  The values of A can be real or complex,
but the real case dominates applications in machine learning.  The most prominent properties
of the SVD are:

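  * Any real matrix has a decomposition in which U, s and V contain only real values
  * The singular values in s are unique.  U and V are unique only up to matching column permutations
and sign changes, and up to a change of basis among columns that share a repeated singular
value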
  * If you keep only the largest n values of s and set the rest to zero, you get the best
rank-n approximation of A in the least squares sense.  This allows SVD to be used very effectively
in least squares regression and makes partial SVD useful.
  * The SVD can be computed accurately for singular or nearly singular matrices.  For a matrix
of rank n, only the first n singular values will be non-zero.  This allows SVD to be used
for solution of singular linear systems.  The columns of V corresponding to zero singular
values span the null space of A, and the corresponding columns of U span the null space of A'.
  * The partial SVD of very large matrices can be computed very quickly using stochastic
decompositions.  See http://arxiv.org/abs/0909.4061v1 for details.  Gradient descent can also
be used to compute partial SVDs and is very useful where some values of the matrix being
decomposed are not known.
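
The truncation property in the list above can be illustrated with a minimal sketch in plain
Python.  This is not Mahout code and it is not the stochastic method from the paper linked
above.  It assumes simple power iteration with deflation, which is only practical for small
dense matrices, and the function names (top_singular_triplet, partial_svd, reconstruct,
frobenius_error) are made up for this example.

```python
import math
import random

def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def top_singular_triplet(A, iters=500, seed=0):
    """Power iteration on A'A; returns (sigma, u, v) for the largest singular value."""
    rng = random.Random(seed)
    At = [list(col) for col in zip(*A)]
    v = [rng.gauss(0.0, 1.0) for _ in range(len(A[0]))]
    n = norm(v)
    v = [x / n for x in v]
    for _ in range(iters):
        w = matvec(At, matvec(A, v))
        n = norm(w)
        if n < 1e-300:  # A is numerically zero, nothing left to find
            break
        v = [x / n for x in w]
    Av = matvec(A, v)
    sigma = norm(Av)
    u = [x / sigma for x in Av] if sigma > 0.0 else [0.0] * len(A)
    return sigma, u, v

def partial_svd(A, k):
    """Return the k leading (sigma, u, v) triplets by repeated deflation."""
    R = [row[:] for row in A]
    triplets = []
    for j in range(k):
        sigma, u, v = top_singular_triplet(R, seed=j)
        triplets.append((sigma, u, v))
        # Subtract the component just found so the next pass sees the remainder.
        R = [[R[i][c] - sigma * u[i] * v[c] for c in range(len(v))]
             for i in range(len(u))]
    return triplets

def reconstruct(triplets, rows, cols):
    """Sum of sigma * u * v' over the kept triplets, i.e. the rank-k approximation."""
    return [[sum(s * u[i] * v[c] for s, u, v in triplets) for c in range(cols)]
            for i in range(rows)]

def frobenius_error(A, B):
    """Square root of the sum of squared element-wise differences."""
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(A, B) for a, b in zip(ra, rb)))

A = [[4.0, 0.0], [3.0, -5.0]]
triplets = partial_svd(A, 1)
A1 = reconstruct(triplets, 2, 2)
print("largest singular value:", triplets[0][0])
print("rank-1 approximation error:", frobenius_error(A, A1))
```

For this particular matrix the singular values are sqrt(40) and sqrt(10), so the rank-1 error
should come out close to sqrt(10): the least squares error equals the size of the singular
values that were dropped.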

In collaborative filtering and text retrieval, it is common to compute the partial decomposition
of the user x item interaction matrix or the document x term matrix.  This allows the projection
of users and items (or documents and terms) into a common vector space representation that
is often referred to as the latent semantic representation.  This process is sometimes called
Latent Semantic Analysis and has been very effective in the analysis of the Netflix dataset.
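
The gradient descent approach mentioned earlier fits the collaborative filtering case well,
because most entries of a user x item matrix are unknown.  The following is a minimal sketch
under stated assumptions, not Mahout's implementation: the ratings are invented, the parameter
names (rank, lr, reg, epochs) are hypothetical, and the factors it produces are not orthonormal,
so it is a low-rank factorization in the spirit of a partial SVD rather than an exact SVD.

```python
import math
import random

def factorize(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02,
              epochs=5000, seed=0):
    """Fit user factors P and item factors Q so that P[u] . Q[i] is close to
    each observed rating, using stochastic gradient descent over the
    observed entries only. Unknown entries are simply never visited."""
    rng = random.Random(seed)
    P = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), r in ratings.items():
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(rank):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Both users and items live in the same latent space; the dot product
    of their vectors is the predicted interaction."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    se = sum((r - predict(P, Q, u, i)) ** 2 for (u, i), r in ratings.items())
    return math.sqrt(se / len(ratings))

# Invented data: (user, item) -> rating. Pairs that are absent are unknown.
ratings = {
    (0, 0): 5, (0, 1): 3, (0, 3): 1,
    (1, 0): 4, (1, 3): 1,
    (2, 0): 1, (2, 1): 1, (2, 3): 5,
    (3, 0): 1, (3, 3): 4,
    (4, 1): 1, (4, 2): 5, (4, 3): 4,
}
P, Q = factorize(ratings, n_users=5, n_items=4)
print("training RMSE:", rmse(ratings, P, Q))
print("predicted rating for user 1, item 1:", predict(P, Q, 1, 1))
```

The rows of P and Q play the role of the common vector space described above: a user and an
item can be compared directly because both are expressed in the same latent coordinates.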

{excerpt}Singular value decomposition (SVD) is a set of techniques for dealing with sets of
equations or matrices that are either singular or numerically very close to singular.  SVD
allows one to diagnose the problems in a given matrix and provides a numerical answer as
well.{excerpt}
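
As a small illustration of that diagnostic use, the ratio of the largest to the smallest singular
value (the condition number) shows how close a matrix is to singular.  The sketch below is
an assumption-laden toy: it handles only the 2x2 case through a closed-form formula, and the
function names are made up for this example.

```python
import math

def singular_values_2x2(a, b, c, d):
    """Singular values of [[a, b], [c, d]], largest first.

    They are the square roots of the eigenvalues of A'A. The smaller one is
    computed as |det A| / s_max to avoid cancellation for near-singular input.
    """
    t = a * a + b * b + c * c + d * d      # trace of A'A
    det = a * d - b * c                    # det A; det(A'A) = det^2
    disc = math.sqrt(max(t * t - 4.0 * det * det, 0.0))
    s_max = math.sqrt((t + disc) / 2.0)
    s_min = abs(det) / s_max if s_max > 0.0 else 0.0
    return s_max, s_min

def condition_number(a, b, c, d):
    """s_max / s_min; infinite when the matrix is exactly singular."""
    s_max, s_min = singular_values_2x2(a, b, c, d)
    return math.inf if s_min == 0.0 else s_max / s_min

print("well conditioned:", condition_number(4.0, 0.0, 3.0, -5.0))
print("nearly singular: ", condition_number(1.0, 2.0, 2.0, 4.000001))
print("singular:        ", condition_number(1.0, 2.0, 2.0, 4.0))
```

A very large condition number flags a matrix whose small singular values should be treated
as zero before solving a linear system with it.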

 See Also:
 * http://www.kwon3d.com/theory/jkinem/svd.html
 * http://en.wikipedia.org/wiki/Singular_value_decomposition
 * http://en.wikipedia.org/wiki/Latent_semantic_analysis
 * http://en.wikipedia.org/wiki/Netflix_Prize
 * http://web.mit.edu/be.400/www/SVD/Singular_Value_Decomposition.htm
