mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Fold-in for ALSWR
Date Tue, 30 Apr 2013 05:56:27 GMT
ALS-WR is not predicting your input matrix R, but the matrix P, which
is defined as P = (R != 0). It is not predicting ratings, but a 0/1
indicator of whether the user-item connection exists. So the predicted
values are usually in [0,1].
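
To illustrate, here is a minimal sketch of a fold-in against the 0/1 indicator rather than the raw ratings. It uses plain Java arrays instead of Mahout types, and it makes some assumptions that are not from this thread: the class and method names (FoldInSketch, toIndicator, foldIn, predict) are hypothetical, the item-feature matrix v and the lambda value are made-up toy inputs, and the solve is an unweighted regularized least squares that ignores any confidence weighting the real model may apply.

```java
import java.util.Arrays;

// Hypothetical sketch, not Mahout's implementation.
class FoldInSketch {

    // Build the indicator vector p = (r != 0): 1 where the user rated the item, else 0.
    static double[] toIndicator(double[] ratings) {
        double[] p = new double[ratings.length];
        for (int i = 0; i < ratings.length; i++) {
            p[i] = ratings[i] != 0.0 ? 1.0 : 0.0;
        }
        return p;
    }

    // Solve (V'V + lambda*I) x = V'p for the new user's feature vector x,
    // using Gaussian elimination with partial pivoting on the augmented system.
    static double[] foldIn(double[][] v, double[] p, double lambda) {
        int k = v[0].length;
        double[][] a = new double[k][k + 1];
        for (int i = 0; i < v.length; i++) {
            for (int r = 0; r < k; r++) {
                for (int c = 0; c < k; c++) {
                    a[r][c] += v[i][r] * v[i][c];   // accumulate V'V
                }
                a[r][k] += v[i][r] * p[i];          // accumulate V'p
            }
        }
        for (int r = 0; r < k; r++) {
            a[r][r] += lambda;                      // ridge term
        }
        for (int col = 0; col < k; col++) {
            int piv = col;
            for (int r = col + 1; r < k; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
            }
            double[] tmp = a[col]; a[col] = a[piv]; a[piv] = tmp;
            for (int r = col + 1; r < k; r++) {
                double f = a[r][col] / a[col][col];
                for (int c = col; c <= k; c++) a[r][c] -= f * a[col][c];
            }
        }
        double[] x = new double[k];
        for (int r = k - 1; r >= 0; r--) {          // back substitution
            double s = a[r][k];
            for (int c = r + 1; c < k; c++) s -= a[r][c] * x[c];
            x[r] = s / a[r][r];
        }
        return x;
    }

    // Predicted indicator score for every item: V x.
    static double[] predict(double[][] v, double[] x) {
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            for (int c = 0; c < x.length; c++) out[i] += v[i][c] * x[c];
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] v = {{1, 0}, {0, 1}, {1, 1}};    // toy item features (assumed)
        double[] p = toIndicator(new double[] {4, 0, 3});
        double[] x = foldIn(v, p, 0.0);
        System.out.println(Arrays.toString(predict(v, x)));
    }
}
```

With a target of this kind, scores well below the 1-4 rating scale are what one would expect from a fold-in, since the fitted values approximate "rated or not" rather than the rating itself.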

On Tue, Apr 30, 2013 at 2:40 AM, Chloe <chloe.guszo@gmail.com> wrote:
> Dear Sean,
>
> Thanks a lot for a quick and helpful reply. Having been sidetracked with
> another project, I revisited the problem I posed in my post over the weekend
> and, unfortunately, have a follow-up question.
>
> The problem I'm facing with my implementation of your explanation is that
> the predicted ratings for new users seem to be on a very different scale
> than the original ratings the model is based on and I'm wondering what I've
> done wrong.
>
> To recap my steps in pseudocode:
> 1. Use a text file of ratings on a 1-4 scale to generate my model, which is
> then given by the files U/part-m-00000 and V/part-m-00000, so that Ratings = UV'.
>
> 2. Vector newRatings = new DenseVector(new double[] {0,1,0,3,4,0,2,3,0,1});
>     // e.g., a new user's ratings, given 10 items
>     Matrix Au = new DenseMatrix(newRatings.size(), 1);
>     Au.assignColumn(0, newRatings);
>     QRDecomposition qr = new QRDecomposition(V);  //item features
>     Matrix Xu = qr.solve(Au);
>     Matrix predictedUserRatingsForAllItems =
> (Xu.transpose()).times(V.transpose());
>
> 3. DenseVector predictedUserRatingsVector =
> (DenseVector)predictedUserRatingsForAllItems.viewRow(0);
>
> The "predictedUserRatingsVector" from step 3, however, gives a top 10 item
> result with scores ranging from 0.46 to 0.62. These numbers go up with the
> number of new items rated. This means that even for item 5, which was given
> the highest possible score of 4, this approach can't return a rating for an
> already-rated item that is close to its actual value.
> Moreover, the new user's ratings I test, {0,1,0,3,4,0,2,3,0,1}, are actually
> identical to those of an existing user who was used to build the model and
> whose predicted ratings are very reasonable, looking like
> {0.5,0.98,0.89,3.23,4.1,1.01,2.32,2.99,3.5,1.1}.
>
> I must be doing something wrong or missing something. Is there anything you
> or anyone from the list with fold-in experience can suggest I try or
> consider that would explain why this is happening? I expected that predicted
> ratings from fold-in would not be as good as regenerating the model, but not
> this bad.
>
> Many thanks,
> Chloe
>
>
>
>