mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Understanding the SVD recommender
Date Thu, 17 Nov 2011 21:36:28 GMT
Adding weights is actually not at all incompatible with the original idea of singular value
decomposition. It may make some algorithms trickier, but weighting is a very reasonable companion
to any approximation algorithm.

Sent from my iPhone

On Nov 17, 2011, at 12:57, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> Yes. This is even one more step away from straightforward SVD, i.e.
> explicitly analyzing implicit feedback (pun intended).
> 
> On Thu, Nov 17, 2011 at 12:38 PM, Sebastian Schelter <ssc@apache.org> wrote:
>> I think Dmitriy's description of the SGD and ALS-WR approach hits the
>> nail on the head.
>> 
>> However, there is a third way to factorize the rating matrix which we
>> haven't talked about yet. It's described in Yehuda Koren's
>> "Collaborative Filtering for Implicit Feedback Datasets"
>> http://research.yahoo.com/pub/2433 and I recently added it to
>> ParallelALSFactorizationJob.
>> 
>> This approach works on implicit feedback data (like the number of
>> times a user watched a television series) and all unobserved
>> interactions are by definition 0. Using a standard SVD would result in
>> the problems Dmitriy described.
>> 
>> But the paper introduces a very interesting approach: the user-item
>> matrix holds 0s and 1s only (0 in a cell if there have been no
>> interactions, 1 if there have been 1 or more interactions). This
>> matrix is decomposed into two other matrices X and Y (user and item
>> features) by minimizing the (regularized) squared error over all
>> observations (which is the same as in ALS-WR). However, the error is
>> weighted by a confidence value that is very low if the user never
>> interacted with the item (because he simply might not be aware that
>> this item exists) and very high if the user interacted very often with
>> the item (a good indication of preference). That should help to avoid
>> the problems that Dmitriy described.
>> 
>> --sebastian
>> 
>> 
>> 2011/11/17 Dmitriy Lyubimov <dlieu.7@gmail.com>:
>>> On Thu, Nov 17, 2011 at 11:30 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>>> I will finish adding a Cholesky decomposition route as an option to
>>>> SSVD some time early in Q1 2012.
>>>> 
>>> 
>>> PPS I already put some jobs in (they are in the trunk) for the Cholesky
>>> route. I thought it would be an easy mod, but then I saw that it would
>>> require a few more modifications to also support power
>>> iterations the same way they are supported today. I also still
>>> haven't quite finished thinking through what it would take
>>> to modify the U job to produce U without Q in this case; it seems this
>>> route will require 100% special handling, and I wouldn't be able to
>>> reuse any of the current U job for this option.
>>> 
>>> For these reasons, I decided to wait until I figure out all of the
>>> remaining architectural issues before I proceed. That would
>>> better be one longer chunk of time rather than several little
>>> chunks, which makes it depend more on my schedule to find where
>>> that chunk might be.
>>> 
>> 
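
The confidence-weighted factorization Sebastian describes in the quoted message can be sketched roughly as below. This is a minimal dense NumPy illustration, not Mahout's ParallelALSFactorizationJob: the function names, the toy parameters, and the confidence form c = 1 + alpha * r (the form I recall from the paper) are all assumptions made for the example. It builds the 0/1 preference matrix, weights each squared error by its confidence, and alternates regularized least-squares solves for X and Y.

```python
import numpy as np


def implicit_als(R, rank=2, alpha=40.0, lam=0.1, iters=10, seed=0):
    """Confidence-weighted ALS on a dense (users x items) count matrix R.

    Hypothetical helper for illustration only. Assumes confidence
    c = 1 + alpha * r, so unobserved cells get a low weight of 1.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = (R > 0).astype(float)   # preference: 1 if any interaction, else 0
    C = 1.0 + alpha * R         # confidence grows with interaction count
    X = rng.normal(scale=0.1, size=(n_users, rank))  # user features
    Y = rng.normal(scale=0.1, size=(n_items, rank))  # item features
    reg = lam * np.eye(rank)

    for _ in range(iters):
        # Fix Y and solve a weighted ridge regression for each user row.
        for u in range(n_users):
            Yw = Y.T * C[u]                      # (rank, n_items), columns scaled
            X[u] = np.linalg.solve(Yw @ Y + reg, Yw @ P[u])
        # Fix X and solve the same kind of system for each item row.
        for i in range(n_items):
            Xw = X.T * C[:, i]                   # (rank, n_users)
            Y[i] = np.linalg.solve(Xw @ X + reg, Xw @ P[:, i])
    return X, Y


def weighted_loss(R, X, Y, alpha=40.0, lam=0.1):
    """Regularized, confidence-weighted squared error over all cells."""
    P = (R > 0).astype(float)
    C = 1.0 + alpha * R
    err = P - X @ Y.T
    return float((C * err ** 2).sum() + lam * ((X ** 2).sum() + (Y ** 2).sum()))


if __name__ == "__main__":
    # Toy data: rows are users, columns are items, values are watch counts.
    R = np.array([[5, 0, 1, 0],
                  [4, 0, 0, 1],
                  [0, 3, 0, 4],
                  [0, 5, 1, 0]], dtype=float)
    X, Y = implicit_als(R)
    print(np.round(X @ Y.T, 2))   # predicted preference scores
```

Each inner solve is an exact minimizer for one row with the other factor held fixed, so the weighted loss cannot increase from one sweep to the next. A real implementation would exploit sparsity instead of forming dense per-row systems over all items.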
