mahout-user mailing list archives

From Dan Filimon <>
Subject Re: Log-likelihood ratio test as a probability
Date Fri, 21 Jun 2013 09:13:17 GMT
The thing is, there's no real model for which these are features.
I'm looking for pairs of similar items (and eventually groups). I'd like a
probabilistic interpretation of how similar two items are. Something like
"what is the probability that a user who likes one will also like the other?"

Then, with these probabilities per day, I'd combine them over the course of
multiple days by "pulling" the older probabilities towards 0.5. The linear
approach would be alpha * 0.5 + (1 - alpha) * p, where alpha is 0 for the
most recent day and larger for older ones. Then, I'd take the average of
those estimates.
The result would, in my mind, be a "smoothed" probability.
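
A minimal sketch of the smoothing described above, in Python. The message
does not say how alpha should grow with age, so the geometric schedule
alpha = 1 - decay**age (and the names age_alpha, shrink_towards_half and
smoothed_probability) are assumptions made only for illustration. Input is
one pair's daily probabilities, most recent day first.

```python
def age_alpha(age_days, decay=0.8):
    """Hypothetical schedule: 0 for today, approaching 1 for old days."""
    return 1.0 - decay ** age_days


def shrink_towards_half(p, alpha):
    """Linear pull of p towards 0.5: alpha * 0.5 + (1 - alpha) * p."""
    return alpha * 0.5 + (1.0 - alpha) * p


def smoothed_probability(daily_probs, decay=0.8):
    """Average of the age-shrunk daily probabilities.

    daily_probs[0] is the most recent day, daily_probs[1] the day
    before, and so on.
    """
    if not daily_probs:
        raise ValueError("need at least one daily probability")
    shrunk = [
        shrink_towards_half(p, age_alpha(age, decay))
        for age, p in enumerate(daily_probs)
    ]
    return sum(shrunk) / len(shrunk)


if __name__ == "__main__":
    # Made-up daily values for a single item pair, newest first.
    print(smoothed_probability([0.9, 0.7, 0.95]))
```

With this schedule, a day that is many days old contributes roughly 0.5
no matter what its original value was, so the average is dominated by
recent days only when the window is short.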

Then, I'd get the top k per item from these.

On Fri, Jun 21, 2013 at 11:45 AM, Ted Dunning <> wrote:

> On Fri, Jun 21, 2013 at 8:25 AM, Dan Filimon <> wrote:
> > Thanks for the reference! I'll take a look at chapter 7, but let me first
> > describe what I'm trying to achieve.
> >
> > I'm trying to identify interesting pairs, the anomalous co-occurrences
> > with the LLR. I'm doing this for a day's data and I want to keep the
> > p-values. I then want to use the p-values to compute some overall
> > probability over the course of multiple days to increase confidence in
> > what I think are the interesting pairs.
> >
> You can't reliably combine p-values this way (repeated comparisons and all
> that).
> Also, in practice, if you take the top 50-100 indicators of this sort, the
> p-values will be so astronomically small that frequentist tests of
> significance are ludicrous.
> That said, the assumptions underlying the tests are really a much bigger
> problem.  The interesting problems of the world are often highly
> non-stationary, which can lead to all kinds of problems in interpreting
> these results.  What does it mean if something shows a 10^-20 p-value one
> day and a 0.2 value the next? Are you going to multiply them?  Or just say
> that something isn't quite the same?  But how do you avoid comparing
> p-values in this case, which is a famously bad practice?
> To my mind, the real problem here is that we are simply asking the wrong
> question.  We shouldn't be asking about individual features.  We should be
> asking about overall model performance.  You *can* measure real-world
> performance and you *can* put error bars around that performance and you
> *can* see changes and degradation in that performance.  All of those
> comparisons are well-founded and work great.  Whether the model has
> selected too many or too few variables really is a diagnostic matter that
> has little to do with answering the question of whether the model is
> working well.
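
For reference, the LLR and p-value discussed in the quoted message can be
sketched in Python as below. This is an illustration rather than Mahout's
implementation: it computes the G^2 statistic for a 2x2 co-occurrence
table and converts it to a p-value using the usual asymptotic chi-squared
distribution with one degree of freedom. The function names and the counts
in the example are made up.

```python
import math


def llr_2x2(k11, k12, k21, k22):
    """G^2 log-likelihood ratio statistic for a 2x2 contingency table.

    k11: both items together, k12: first item without the second,
    k21: second item without the first, k22: neither item.
    """
    n = k11 + k12 + k21 + k22
    row = (k11 + k12, k21 + k22)
    col = (k11 + k21, k12 + k22)
    cells = (
        (k11, row[0], col[0]),
        (k12, row[0], col[1]),
        (k21, row[1], col[0]),
        (k22, row[1], col[1]),
    )
    total = 0.0
    for k, r, c in cells:
        if k > 0:  # a zero cell contributes nothing (0 * log 0 := 0)
            total += k * math.log(k * n / (r * c))
    return max(0.0, 2.0 * total)  # clamp tiny negative rounding error


def llr_p_value(llr):
    """Upper tail of chi-squared with 1 degree of freedom at llr."""
    return math.erfc(math.sqrt(llr / 2.0))


if __name__ == "__main__":
    # Made-up counts for a strongly associated pair.
    score = llr_2x2(100, 10, 10, 10000)
    print(score, llr_p_value(score))
```

Even with these modest counts the p-value is far below anything a
significance threshold could meaningfully distinguish, which is the point
made above about the top indicators.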
