mahout-user mailing list archives

From Nimrod Priell <nimrod.pri...@gmail.com>
Subject Re: "Direction" of co-occurence and log-likelihood ratio
Date Fri, 22 Jun 2012 11:40:03 GMT
Thanks for both replies. They helped me a lot.

On Jun 21, 2012, at 11:20 PM, Ted Dunning wrote:

> Most correlation measures have trouble with small counts. They ascribe very high scores to coincidence (hence the title of the original paper).
> 
> Sent from my iPhone
> 
> On Jun 21, 2012, at 2:01 PM, Nimrod Priell <nimrod.priell@gmail.com> wrote:
> 
>> 
>> I did note that LingPipe uses a different type of scoring for collocations, Pearson C_2 goodness of fit (it seems different from LLR, but I didn't dig deep): http://alias-i.com/lingpipe/demos/tutorial/interestingPhrases/read-me.html
>> (the exact method is documented in the code: http://alias-i.com/lingpipe/docs/api/com/aliasi/lm/TokenizedLM.html#chiSquaredIndependence(int[]) ). Is that method a good way to capture what I'd like?
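
For anyone reading the archive, here is a minimal sketch of the small-count point quoted above. It compares a Pearson chi-squared statistic with a log-likelihood ratio (G^2) on a 2x2 co-occurrence table. The function names and the example counts are made up for illustration; this is not the Mahout or LingPipe implementation, only the textbook formulas, and it assumes the table total is greater than zero.

```python
import math


def expected_counts(k11, k12, k21, k22):
    """Expected cell counts under independence, in the order k11, k12, k21, k22."""
    n = k11 + k12 + k21 + k22  # assumed > 0
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    return [rows[i] * cols[j] / n for i in range(2) for j in range(2)]


def pearson_chi_squared(k11, k12, k21, k22):
    """Textbook Pearson statistic: sum of (observed - expected)^2 / expected."""
    observed = [k11, k12, k21, k22]
    expected = expected_counts(k11, k12, k21, k22)
    # Cells with a zero expected count are skipped to avoid dividing by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def log_likelihood_ratio(k11, k12, k21, k22):
    """Textbook G^2 statistic: 2 * sum of observed * ln(observed / expected)."""
    observed = [k11, k12, k21, k22]
    expected = expected_counts(k11, k12, k21, k22)
    # Empty cells contribute nothing, since 0 * ln(0) is taken to be 0.
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


if __name__ == "__main__":
    # Hypothetical counts: two items that each occur once, together,
    # among 10,000 observations.
    table = (1, 0, 0, 9999)
    print("Pearson chi-squared:", pearson_chi_squared(*table))
    print("Log-likelihood ratio:", log_likelihood_ratio(*table))
```

On this made-up table the Pearson score comes out close to the table total (about 10,000), while the log-likelihood ratio is roughly 20, so a single coincidence dominates a ranking by Pearson far more than one by LLR.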

