mahout-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Interpreting the output of SVD
Date Tue, 23 Nov 2010 08:05:23 GMT
On Mon, Nov 22, 2010 at 11:46 PM, Dan Brickley <danbri@danbri.org> wrote:
> 2010/11/22 Fernando Fernández <fernando.fernandez.gonzalez@gmail.com>:
>> Lance,
>>
>> Columns of U are in some contexts called "latent factors". For example, if
>> we apply SVD to a Document(User)-Term(Item) matrix, columns of U can be
>> interpreted as representing groups of terms (words that have similar
>> meanings or tend to appear together in documents of the same kind), so in
>> this case these "latent" factors are, in a sense, "topics". Another example
>> is applying SVD factorization to the famous movie recommendation problem.
>> The "latent" factors (columns of the U matrix) represent something like
>> "movie topics" (Drama, terror, comedy, and possible combinations of
>> these...). Note that if we are trying to recommend movies, we will
>> recommend movies that have a similar topic, i.e. we will probably
>> recommend a whole topic, not a specific movie... but SVD helps us find
>> which movies fall into that topic. Note that this "topic" could in fact be
>> something more abstract than "Drama" or "comedy".
>
> Naming these seems a fun project; the examples in
> http://www.timelydevelopment.com/demos/NetflixPrize.aspx made me
> smile... but also illustrate the point.
>
> 'Offbeat / Dark-Comedy' vs 'Mass-Market / 'Beniffer' Movies'; 'Good'
> vs 'Twisted'; 'What a 10 year old boy would watch' vs 'What a liberal
> woman would watch'...
>
> cheers,
>
> Dan
>

It's been done:
http://www.nytimes.com/interactive/2010/01/10/nyregion/20100110-netflix-map.html

Just wonderful: matching your local zip codes to rentals is a hoot.
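
For anyone who wants to see the "latent factor" idea quoted above on
something small, here is a minimal sketch in Python with NumPy (not Mahout).
The ratings matrix, the movie names, and the helper movie_groups are all made
up for illustration. One assumption to flag: rows here are users and columns
are movies, so the movie-side loadings come out in Vt; with the matrix
transposed they would be the columns of U instead.

```python
import numpy as np

# Hypothetical toy ratings: rows are users, columns are movies.
movies = ["Drama A", "Drama B", "Comedy A", "Comedy B"]
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
])

# Thin SVD: ratings == U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)


def movie_groups(Vt, factor):
    """Split movies by the sign of their loading on one latent factor."""
    loadings = Vt[factor]
    positive = [m for m, w in zip(movies, loadings) if w > 0]
    negative = [m for m, w in zip(movies, loadings) if w <= 0]
    return positive, negative


# In this toy data the first factor loads equally on every movie (overall
# rating level); the second factor separates the two kinds of movie.
print("singular values:", np.round(s, 3))
print("factor 1 groups:", movie_groups(Vt, 1))
```

The sign of a singular vector is arbitrary, so which group lands on the
positive side can differ between runs or libraries; only the split itself is
meaningful. Naming the factor ("Drama" vs "Comedy") is still up to a human.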

-- 
Lance Norskog
goksron@gmail.com
