mahout-user mailing list archives

From Benson Margulies <bimargul...@gmail.com>
Subject Re: Validating clustering output
Date Tue, 28 Jul 2009 18:49:22 GMT
On Tue, Jul 28, 2009 at 6:55 AM, Grant Ingersoll<gsingers@apache.org> wrote:
>
> On Jul 28, 2009, at 12:48 AM, Ted Dunning wrote:
>
>>
>> I owe the IBM team my interest in statistical approaches to AI and symbolic
>> sequences.  It was on a visit to IBM in 1990 or so that Stephen (or Vincent)
>> dP mentioned off-handedly to me that mutual information was "trivially known
>> to be chi-squared distributed asymptotically".
>
> I love statements like these!  Takes me back to the good old Math days of
> "We'll leave it as an exercise to the reader" or proofs that start off by
> saying "It is trivial to prove ..., so we'll proceed to the main part of the
> proof" and, as a 20-year-old Math student, you spend the next day beating
> your head against the wall because it is anything but trivial to you!

And, indeed, the paper that started this thread is a shining example
of that sort of thing from the point of view of actual programming.
The 'description' of how to get from the O(5) obvious to something
usable is largely notable for what it does not say.

>
> -Grant
>
