mahout-user mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: Why there is "Infinity" values for the vector of a K-Means cluster center point?
Date Fri, 16 Mar 2012 16:16:55 GMT
Good question. The only way I can think of an infinity in a Kluster
center is if there were some infinity values in the vectors it observed.
The center (centroid) is calculated in each iteration, after all points
have been observed, by dividing S1 (the sum of the observed points) by S0
(the number of points observed). If, for some reason, S0 were zero, the
division would make all of the center elements infinity (or NaN). We
check for that case, so it is unlikely to be the cause.
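
Here is a rough sketch of that calculation, assuming S0 is the count of observed points and S1 is their per-dimension sum. It is not the actual Mahout code; the class name CentroidSketch and the method computeCentroid are made up for illustration. It shows the guard on S0 and how a single infinite input value is enough to make a center element infinite.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not Mahout's cluster implementation.
final class CentroidSketch {

    // s0: number of observed points; s1: per-dimension sum of those points.
    // Returns s1 / s0, or null when s0 is zero so the caller can keep the old center.
    static double[] computeCentroid(double s0, double[] s1) {
        if (s0 == 0.0) {
            return null;
        }
        double[] center = new double[s1.length];
        for (int i = 0; i < s1.length; i++) {
            center[i] = s1[i] / s0;
        }
        return center;
    }

    public static void main(String[] args) {
        // Two observed points, (1, 2) and (3, 6), give the sum (4, 8).
        System.out.println(Arrays.toString(computeCentroid(2.0, new double[] {4.0, 8.0})));

        // Without the guard, floating-point division by zero does not throw;
        // it silently yields Infinity, -Infinity, or NaN.
        System.out.println(4.0 / 0.0);
        System.out.println(-4.0 / 0.0);
        System.out.println(0.0 / 0.0);

        // One -Infinity among the inputs makes the sum, and so the center, -Infinity.
        double[] badSum = {1.0 + Double.NEGATIVE_INFINITY, 8.0};
        System.out.println(Arrays.toString(computeCentroid(2.0, badSum)));
    }
}
```

Once a center element is infinite, a dot product against it is infinite too, and a cosine distance built from that dot product can come out as NaN.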

Can you narrow it down a bit more? How are you getting the kmeans prior?
By sampling input vectors (-k) or using Canopy? Are there any infinity
values in clusters-0?
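
To narrow it down, something like the following could be run over the input vectors and over the centers in clusters-0. This is only a sketch on plain double arrays: the class NonFiniteScan and its method are hypothetical, and reading the vectors out of your sequence files is left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for locating bad values; not part of Mahout.
final class NonFiniteScan {

    // Returns the index of the first NaN or infinite element, or -1 if all are finite.
    static int firstNonFinite(double[] v) {
        for (int i = 0; i < v.length; i++) {
            if (Double.isNaN(v[i]) || Double.isInfinite(v[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in data; in practice these would be your input vectors or cluster centers.
        List<double[]> vectors = Arrays.asList(
            new double[] {0.5, 1.5, 2.0},
            new double[] {0.1, Double.NEGATIVE_INFINITY, 3.0},
            new double[] {Double.NaN, 0.0, 1.0});

        for (int n = 0; n < vectors.size(); n++) {
            int bad = firstNonFinite(vectors.get(n));
            if (bad >= 0) {
                System.out.println("vector " + n + " has a non-finite value at index " + bad);
            }
        }
    }
}
```

If the inputs are clean but clusters-0 is not, the problem is in how the prior was generated; if the inputs already contain such values, they need to be fixed upstream of clustering.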


On 3/15/12 10:11 PM, WangRamon wrote:
> Hi guys,
>
> I’m running the k-Means driver and I find that most (95%) of my input
> vectors end up in one cluster. I retrieved the big Cluster object and one
> of the points that belongs to it, and used CosineDistanceMeasure to
> calculate the distance between them. The distance was “NaN”. When I
> debugged, I found that the center point vector of the Cluster object
> contains some “-Infinity” values, so when the measure method is called,
> AbstractVector.dot returns “Infinity”. Could this be the reason most of my
> input vectors belong to one big cluster? And why are there “-Infinity”
> values in the center point? Thanks in advance. BTW, I'm using the Mahout
> 0.6 release.
>
> Cheers
>
> Ramon

