mahout-user mailing list archives

From Ted Dunning <>
Subject Re: kMeans Help
Date Mon, 29 Jun 2009 01:53:14 GMT
Hard to dispute that!

This definitely does not sound like a theory problem so much as simple
implementation woes.

On Sun, Jun 28, 2009 at 2:55 PM, Grant Ingersoll <> wrote:

> On Jun 28, 2009, at 4:56 PM, Grant Ingersoll wrote:
>> I get all of this, my point is that when you rehydrate the Cluster, it
>> doesn't properly report the centroid, per my email, all because numPoints == 0
>> and pointTotal is a vector that is the same as the passed-in center
>> vector, but initialized to 0.
> In other words, the simple act of serializing a Cluster to HDFS and then
> reconstituting it should not alter the result one gets, which I believe is
> what happens if one dumps out the clusters that have been calculated after
> the whole process is done.
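
The round-trip problem described in the quoted text can be illustrated with a small sketch. This is not Mahout's Cluster class: the class name ClusterRoundTrip, the methods naiveCentroid, computeCentroid, serialize and rehydrate, and the text format are all hypothetical, chosen only to mirror the numPoints and pointTotal fields mentioned above. It assumes that only the centroid is written out, so a rehydrated cluster starts with numPoints == 0 and an all-zero pointTotal.

```java
import java.util.Arrays;

// Hypothetical stand-in for a k-means cluster; NOT Mahout's actual Cluster class.
class ClusterRoundTrip {
    final double[] center;     // center the cluster was constructed with
    final double[] pointTotal; // running sum of the points assigned so far
    int numPoints;             // number of points assigned so far

    ClusterRoundTrip(double[] center) {
        this.center = center.clone();
        this.pointTotal = new double[center.length]; // same length as center, all zeros
        this.numPoints = 0;
    }

    void addPoint(double[] p) {
        for (int i = 0; i < p.length; i++) {
            pointTotal[i] += p[i];
        }
        numPoints++;
    }

    // Unguarded version: divides by numPoints even when it is 0.
    double[] naiveCentroid() {
        double[] c = new double[pointTotal.length];
        for (int i = 0; i < c.length; i++) {
            c[i] = pointTotal[i] / numPoints;
        }
        return c;
    }

    // Guarded version: with no points assigned, report the stored center.
    double[] computeCentroid() {
        return numPoints == 0 ? center.clone() : naiveCentroid();
    }

    // Assumed format: only the centroid is written, e.g. "[2.0, 3.0]".
    String serialize() {
        return Arrays.toString(computeCentroid());
    }

    // Rebuilds a cluster from the text above; numPoints and pointTotal start at zero.
    static ClusterRoundTrip rehydrate(String s) {
        String[] parts = s.substring(1, s.length() - 1).split(",");
        double[] c = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            c[i] = Double.parseDouble(parts[i].trim());
        }
        return new ClusterRoundTrip(c);
    }

    public static void main(String[] args) {
        ClusterRoundTrip original = new ClusterRoundTrip(new double[] {0.0, 0.0});
        original.addPoint(new double[] {1.0, 2.0});
        original.addPoint(new double[] {3.0, 4.0});

        ClusterRoundTrip restored = rehydrate(original.serialize());
        System.out.println("original: " + Arrays.toString(original.computeCentroid()));
        System.out.println("guarded:  " + Arrays.toString(restored.computeCentroid()));
        System.out.println("naive:    " + Arrays.toString(restored.naiveCentroid()));
    }
}
```

In this sketch the guarded centroid survives the round trip unchanged, while the unguarded one no longer matches the original once numPoints is 0. Whether the real code fails in the same way is an assumption here, not something the thread confirms.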
