spark-user mailing list archives

From Manish Tripathi <tr.man...@gmail.com>
Subject Re: Negative values of predictions in ALS.tranform
Date Thu, 15 Dec 2016 23:30:47 GMT
OK. So we can loosely interpret the output as probabilities, even though
ALS is not modeling probabilities. That is what allows it to be used with
BinaryClassificationEvaluator.

The way I understand it, and as per the algorithm, the predicted matrix is
basically the product of the user factor matrix and the item factor matrix.

But under what circumstances can the predicted ratings be negative? I can
understand that if the individual user factor vector and item factor vector
have negative terms, then the dot product can be negative. But practically,
does a negative value make any sense? As per the algorithm, the dot product
is the predicted rating, so the rating shouldn't be negative for it to make
any sense. Also, is a rating that just falls between 0 and 1 a normalised
rating? Typically we expect a rating to be any real value, like 2.3, 4.5, etc.
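
To make the dot-product point concrete, here is a minimal sketch in plain
Python. The rank-3 factor values are made up for illustration and are not
output from any trained model; the helper name `predict` is hypothetical.

```python
# Hypothetical rank-3 latent factors (illustrative values only).
user_factors = [0.8, -0.5, 0.1]
item_a = [0.9, -0.4, 0.2]   # factors aligned with the user's
item_b = [-0.3, 0.6, 0.1]   # factors pointing the opposite way


def predict(user, item):
    """Predicted value = dot product of a user and an item factor vector."""
    return sum(u * v for u, v in zip(user, item))


score_a = predict(user_factors, item_a)
score_b = predict(user_factors, item_b)
print(score_a, score_b)
```

Nothing in a plain dot product keeps the result inside [0, 1], so when the
factor vectors contain entries of mixed sign the second score here comes out
below zero.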

Also please note, for implicit feedback ALS, we don't feed a 0/1 matrix. We
feed the count matrix (discrete count values), and I am assuming Spark
internally converts it into a preference matrix (1/0) and a confidence
matrix = 1 + alpha * count_matrix.
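
The conversion assumed above can be sketched in plain Python as follows. This
only illustrates the preference and confidence definitions as stated in this
message; it is not Spark's internal code, and the count values are
hypothetical.

```python
ALPHA = 40.0  # same alpha as in the training call quoted in this thread

# Hypothetical view counts: rows are users, columns are videos.
counts = [
    [3, 0, 1],
    [0, 5, 0],
]

# Preference: 1 where any interaction was observed, 0 otherwise.
preference = [[1 if c > 0 else 0 for c in row] for row in counts]

# Confidence: 1 + alpha * count, so unobserved cells keep a baseline of 1.
confidence = [[1 + ALPHA * c for c in row] for row in counts]

print(preference)
print(confidence)
```

Under this reading the model is fit against the 0/1 preference values, with
the counts only affecting how heavily each cell is weighted.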





On Thu, Dec 15, 2016 at 2:56 PM, Sean Owen <sowen@cloudera.com> wrote:

> No, ALS is not modeling probabilities. The outputs are reconstructions of
> a 0/1 matrix. Most values will be in [0,1], but, it's possible to get
> values outside that range.
>
> On Thu, Dec 15, 2016 at 10:21 PM Manish Tripathi <tr.manish@gmail.com>
> wrote:
>
>> Hi
>>
>> I ran the ALS model for implicit feedback. Then I used the .transform
>> method of the model to predict the ratings for the original dataset. My
>> dataset is of the form (user, item, rating).
>>
>> I see something like below:
>>
>> predictions.show(5,truncate=False)
>>
>>
>> Why is the last prediction value negative? Isn't the transform method
>> giving the prediction (probability) of seeing the rating as 1? I had count
>> data for the rating (implicit feedback), and for the validation dataset I
>> binarized the rating (1 if >0 else 0). My training data has positive
>> ratings (it's basically the count of views of a video).
>>
>> I used the following to train:
>>
>> als = ALS(rank=x, maxIter=15, regParam=y, implicitPrefs=True, alpha=40.0)
>> model = als.fit(self.train)
>>
>> What does a negative prediction mean here, and is it OK to have that?
>>
>
