mahout-user mailing list archives

From Sean Owen
Subject Re: Questions about PearsonCorrelation on a example
Date Wed, 24 Jun 2009 01:01:30 GMT
I will have to read up on multinomial likelihood. I don't see how
mutual information applies to this problem, though. I'm still trying
to figure out what the random variables are...

I think I have reached the same conclusion: rating data is typically
noisy enough that it is hard to use. I agree about implicit
observations too.

On Tue, Jun 23, 2009 at 8:12 PM, Ted Dunning wrote:
> Multinomial likelihood ratios can handle any size contingency table.  I
> haven't used them for this, though.
> Of course, it is commonly true that ratings break down as 80+% very
> positive, ~10% very negative and ~10% intermediate values.  To my mind, this
> is just as well summarized as negative, positive or no strong value.
> Furthermore, there is very little loss in forgetting the negative ratings
> because it is so hard to interpret them well (a negative rating often means
> "this is *exactly* what I wanted except for some tiny nit that drives me
> completely non-linear").  There is a long tradition going back to Shardanand
> of using multiple levels of scoring in collaborative filtering, but there is
> little evidence that it is useful.
> Even more of a problem, though, is the fact that only a few percent of the
> users ever rate anything.  That makes implicit observations much more useful
> for most recommendation tasks.
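
For reference, here is a minimal sketch of the log-likelihood ratio (G^2) statistic for a contingency table of any size, and how it relates to mutual information. It is an illustration only, not code from Mahout or from anyone in this thread. The function names are hypothetical, and the choice of random variables is an assumption: rows are taken to be the rating level a user gave item A, and columns the rating level the same user gave item B. Under that reading, G^2 = 2 * N * I(A;B), with I measured in nats and N the total count.

```python
import math


def g2_statistic(table):
    """Log-likelihood ratio (G^2) for an r x c contingency table of counts.

    table[i][j] is the number of users whose rating of item A fell in
    level i and whose rating of item B fell in level j (an assumed setup).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        return 0.0
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing (0 * log 0 = 0)
                expected = row_totals[i] * col_totals[j] / n
                total += observed * math.log(observed / expected)
    return 2.0 * total


def mutual_information(table):
    """Empirical mutual information in nats, via I = G^2 / (2 * N)."""
    n = sum(sum(row) for row in table)
    return g2_statistic(table) / (2.0 * n) if n else 0.0


# Hypothetical counts: rows and columns are negative / neutral / positive.
dependent = [[50, 5, 5], [5, 50, 5], [5, 5, 50]]
independent = [[10, 20], [30, 60]]  # every row has the same proportions
```

With the `independent` table the statistic comes out at zero, because every observed count equals its expected count. The `dependent` table gives a clearly positive value, since the two ratings tend to agree.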
