mahout-user mailing list archives

From Ted Dunning <>
Subject Re: Questions about PearsonCorrelation on a example
Date Wed, 24 Jun 2009 00:12:52 GMT
Multinomial likelihood ratios can handle any size contingency table.  I
haven't used them for this, though.
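
As a rough sketch of what that looks like (not something from Mahout, and
not code I have run against rating data), the usual G^2 log-likelihood ratio
statistic can be computed for a table of counts with any number of rows and
columns.  The function name llr and the example tables are made up for
illustration.

```python
import math

def llr(table):
    """G^2 log-likelihood ratio statistic for an r x c table of counts.

    Compares observed counts with the counts expected if rows and
    columns were independent. Larger values mean stronger association.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, k in enumerate(row):
            if k > 0:  # empty cells contribute nothing (0 * log 0 is taken as 0)
                expected = row_totals[i] * col_totals[j] / total
                g2 += k * math.log(k / expected)
    return 2.0 * g2

# Hypothetical 3 x 3 table, e.g. three rating levels for item A
# against three rating levels for item B.
print(llr([[10, 0, 0], [0, 10, 0], [0, 0, 10]]))
# A table whose rows are proportional scores (close to) zero.
print(llr([[10, 20], [20, 40]]))
```

Nothing in the function is specific to 2 x 2 tables, which is the point: the
same sum runs over however many cells there are.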

Of course, it is commonly true that ratings break down as 80+% very
positive, ~10% very negative and ~10% intermediate values.  To my mind, this
is just as well summarized as negative, positive or no strong value.
Furthermore, there is very little loss in forgetting the negative ratings
because it is so hard to interpret them well (a negative rating often means
"this is *exactly* what I wanted except for some tiny nit that drives me
completely non-linear").  There is a long tradition going back to Shardanand
of using multiple levels of scoring in collaborative filtering, but there is
little evidence that it is useful.
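
To make the three-way summary concrete, here is a minimal sketch that
assumes a 1-5 rating scale.  The cutoffs (low=2, high=4) and the name
bucket are illustrative assumptions, not anything taken from Mahout or
from real data.

```python
def bucket(rating, low=2, high=4):
    """Collapse a numeric rating into positive / negative / no strong value.

    The 1-5 scale and the default cutoffs are assumptions for illustration.
    """
    if rating >= high:
        return "positive"
    if rating <= low:
        return "negative"
    return "none"

ratings = [5, 5, 4, 1, 3, 5]
print([bucket(r) for r in ratings])
```

Dropping the negatives entirely, as suggested above, would then amount to
keeping only the items that land in the "positive" bucket.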

Even more of a problem, though, is the fact that only a few percent of the
users ever rate anything.  That makes implicit observations much more useful
for most recommendation tasks.

On Tue, Jun 23, 2009 at 4:20 PM, Sean Owen <> wrote:

> Do any of the approaches you cite take into account the value of
> the rating itself? I agree, seems like there should be some
> alternative to Pearson / cosine-measure to offer, but right now it's
> the only similarity metric that cares about the rating.
