mahout-user mailing list archives

From Lance Norskog <>
Subject Re: Interpreting the output of SVD
Date Tue, 23 Nov 2010 08:05:23 GMT
On Mon, Nov 22, 2010 at 11:46 PM, Dan Brickley <> wrote:
> 2010/11/22 Fernando Fernández <>:
>> Lance,
>> Columns of U are in some contexts called "latent factors". For example, if
>> we apply SVD to a Document(User)-Term(Item) matrix, the columns of U
>> can be interpreted as representing groups of terms (words that have
>> similar meanings or tend to appear together in documents of the same kind),
>> so in this case these "latent" factors are "topics" in some way. Another
>> example is the SVD factorization applied to the famous movie
>> recommendation problem. The "latent" factors (columns of the U matrix)
>> loosely represent "movie topics" (Drama, horror, comedy, and
>> possible combinations of these...). Note that if we are trying to
>> recommend movies, we will recommend movies that have a similar
>> topic, i.e. we will probably recommend a whole topic, not a specific
>> movie... but SVD helps us find which movies fall into that topic. Note that
>> this "topic" could in fact be something more abstract than "Drama" or
>> "comedy".
> Naming these seems a fun project; the examples in
> made me
> smile... but also illustrate the point.
> 'Offbeat / Dark-Comedy' vs 'Mass-Market / 'Beniffer' Movies'; 'Good'
> vs 'Twisted'; 'What a 10 year old boy would watch' vs 'What a liberal
> woman would watch'...
> cheers,
> Dan

It's been done:

Just wonderful: matching your local zip codes to rentals is a hoot.
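
To make the "latent factor" reading quoted above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy ratings matrix, the movie names, and the helper top_factors are all hypothetical, and it uses power iteration with deflation on A^T A rather than anything Mahout does. Rows are users and columns are movies, so the loadings over movies are the right singular vectors (columns of V in A = U S V^T); whether a given set of factors ends up in U or V depends only on how the matrix is oriented.

```python
import math


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def gram(a):
    """Return A^T A for a matrix given as a list of rows."""
    cols = list(zip(*a))
    return [[sum(x * y for x, y in zip(ci, cj)) for cj in cols] for ci in cols]


def top_factors(a, k, iterations=200):
    """Approximate the top-k singular values and right singular vectors
    of a, using power iteration with deflation on A^T A."""
    g = gram(a)
    n = len(g)
    factors = []
    for _ in range(k):
        # Deterministic, non-symmetric start vector (an arbitrary choice).
        v = normalize([float(i + 1) for i in range(n)])
        for _ in range(iterations):
            v = normalize(matvec(g, v))
        # Rayleigh quotient gives the eigenvalue of A^T A for this vector.
        eigenvalue = sum(x * y for x, y in zip(v, matvec(g, v)))
        # The singular value is the square root of that eigenvalue.
        factors.append((math.sqrt(max(eigenvalue, 0.0)), v))
        # Deflate: remove the component just found before the next pass.
        g = [[g[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return factors


# Hypothetical toy data: users 0-1 rate the dramas, users 2-3 the comedies.
movies = ["Drama A", "Drama B", "Comedy A", "Comedy B"]
ratings = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
]

factors = top_factors(ratings, k=2)
for sigma, loadings in factors:
    # Movies with a large loading on a factor form that factor's "topic".
    members = [m for m, w in zip(movies, loadings) if abs(w) > 0.5]
    print(round(sigma, 3), members)
```

On this toy matrix the two factors come out with singular values 9 and 5, the first loading on the two dramas and the second on the two comedies. Real rating data is never this clean, which is why the factors found in practice are often more abstract than a genre label.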

Lance Norskog
