mahout-user mailing list archives

From Dan Brickley <dan...@danbri.org>
Subject Re: Interpreting the output of SVD
Date Tue, 23 Nov 2010 07:46:28 GMT
2010/11/22 Fernando Fernández <fernando.fernandez.gonzalez@gmail.com>:
> Lance,
>
> Columns of U are in some contexts called "latent factors". For example, if
> we apply SVD to a Document(User)-Term(Item) matrix, the columns of U
> can be interpreted as representing groups of terms (words that have
> similar meanings or tend to appear together in documents of the same kind),
> so in this case these "latent" factors are, in a sense, "topics". Another
> example is the SVD factorization used in the famous movie
> recommendation problem. The "latent" factors (columns of the U matrix)
> represent some kind of "movie topics" (drama, horror, comedy, and
> possible combinations of these...). Note that if we are trying to
> recommend movies, we will recommend movies that have a similar
> topic, i.e. we will probably recommend a whole topic, not a specific
> movie... but SVD helps us find which movies fall into that topic. Note that
> this "topic" could in fact be something more abstract than "drama" or
> "comedy".
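
A minimal sketch of the interpretation quoted above, using NumPy rather than Mahout. The matrix, the term names, and the helpers top_terms and top_docs are hypothetical, made up for illustration. It assumes rows are documents and columns are terms; with that layout, column k of U weights the documents on factor k, and row k of Vt weights the terms on the same factor.

```python
import numpy as np

# Hypothetical toy data: rows are documents, columns are terms.
terms = ["goal", "match", "vote", "election"]
A = np.array([
    [3.0, 2.0, 0.0, 0.0],   # doc 0: sports vocabulary
    [2.0, 3.0, 0.0, 0.0],   # doc 1: sports vocabulary
    [0.0, 0.0, 1.0, 1.0],   # doc 2: politics vocabulary
    [0.0, 0.0, 1.0, 1.0],   # doc 3: politics vocabulary
])

# A = U @ diag(s) @ Vt, with singular values in s sorted in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

def top_terms(factor, n=2):
    """Names of the n terms with the largest absolute weight on a factor."""
    order = np.argsort(np.abs(Vt[factor]))[::-1][:n]
    return sorted(terms[i] for i in order)

def top_docs(factor, n=2):
    """Indices of the n documents with the largest absolute weight on a factor."""
    order = np.argsort(np.abs(U[:, factor]))[::-1][:n]
    return sorted(int(i) for i in order)

for k in range(2):
    print(f"factor {k}: strength={s[k]:.1f} "
          f"terms={top_terms(k)} docs={top_docs(k)}")
```

Absolute values are used because the sign of each singular vector is arbitrary. On real data the factors are rarely this clean, which is why naming them takes some judgment.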

Naming these seems a fun project; the examples in
http://www.timelydevelopment.com/demos/NetflixPrize.aspx made me
smile... but also illustrate the point.

'Offbeat / Dark-Comedy' vs 'Mass-Market / 'Beniffer' Movies'; 'Good'
vs 'Twisted'; 'What a 10 year old boy would watch' vs 'What a liberal
woman would watch'...

cheers,

Dan
