mahout-dev mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: Fuzzy K Means
Date Thu, 18 Feb 2010 12:18:23 GMT
Very similar, especially when you consider that k-means only adds the
whole point value to the single, closest cluster (i.e.
weightedPointTotal += 1), whereas fuzzy adds it partially to all. I
don't think the other clustering routines require/expect numPoints to be
an integer, and the instance variable could probably be generalized to a
double weightedPointTotal without impact.

Perhaps better to consider that change separately, as there are a number 
of tests which compare getNumPoints() with an integer value and would 
have to be adjusted. Likely it would be just adding an (int) cast as the 
values in non-fuzzy tests would always be whole numbers.
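
A minimal sketch of the generalization described above, for illustration only. The class name WeightedClusterSketch and its method names are hypothetical and are not the actual ClusterBase or SoftCluster code. It assumes points are plain double arrays and that the count is kept as a double total weight, so k-means adds each point with weight 1 and fuzzy k-means adds it with a partial membership weight.

```java
// Hypothetical sketch; not Mahout's ClusterBase or SoftCluster.
class WeightedClusterSketch {
    // Running sum of weight * point, one entry per dimension.
    private final double[] weightedPointTotal;
    // Generalized "numPoints": a double so partial memberships can be added.
    private double totalWeight;

    WeightedClusterSketch(int dimensions) {
        this.weightedPointTotal = new double[dimensions];
    }

    // k-means style: the whole point goes to the single closest cluster.
    void addPoint(double[] point) {
        addPoint(point, 1.0);
    }

    // Fuzzy style: the cluster takes the point scaled by its membership weight.
    void addPoint(double[] point, double weight) {
        for (int i = 0; i < point.length; i++) {
            weightedPointTotal[i] += weight * point[i];
        }
        totalWeight += weight;
    }

    // Center = weighted point total / total weight (zero vector if empty).
    double[] computeCentroid() {
        double[] centroid = new double[weightedPointTotal.length];
        if (totalWeight == 0.0) {
            return centroid;
        }
        for (int i = 0; i < centroid.length; i++) {
            centroid[i] = weightedPointTotal[i] / totalWeight;
        }
        return centroid;
    }

    double getTotalWeight() {
        return totalWeight;
    }

    // What integer-comparing tests could use: in the non-fuzzy case the
    // total weight is a whole number, so the cast loses nothing.
    int getNumPoints() {
        return (int) totalWeight;
    }
}
```

With this shape, the same averaging step serves both algorithms, and only the weight passed to addPoint differs.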


Pallavi Palleti wrote:
> Yes. But not the total number of points. So, the numpoints from
> ClusterBase will not be used in SoftCluster. numpoints is specific to
> k-means, just as the weighted point total is specific to fuzzy k-means.
>
> Robin Anil wrote:
>> the center is still the averaged out centroid right?
>> weightedtotalvector/totalprobWeight
>>
>>
>>
>> On Wed, Feb 17, 2010 at 5:10 PM, Pallavi Palleti <
>> pallavi.palleti@corp.aol.com> wrote:
>>
>>  
>>> I haven't yet gone through ClusterDumper. However, ClusterBase would
>>> have the number of points to average out (pointTotal/numPoints as per
>>> kmeans), whereas SoftCluster will have the weighted point total. So, I
>>> am wondering how we can reuse ClusterBase here?
>>>
>>>
>>> Thanks
>>> Pallavi
>>>
>>> Robin Anil wrote:
>>>
>>>    
>>>> yes. So that cluster dumper can print it out.
>>>>
>>>> On Wed, Feb 17, 2010 at 5:02 PM, Pallavi Palleti <
>>>> pallavi.palleti@corp.aol.com> wrote:
>>>>
>>>>
>>>>
>>>>      
>>>>> Hi Robin,
>>>>>
>>>>> When you mention reusing ClusterBase, are you planning to extend
>>>>> ClusterBase in SoftCluster? For example, SoftCluster extends
>>>>> ClusterBase?
>>>>>
>>>>> Thanks
>>>>> Pallavi
>>>>>
>>>>>
>>>>> Robin Anil wrote:
>>>>>
>>>>>
>>>>>
>>>>>        
>>>>>> I have been trying to convert FuzzyKMeans SoftCluster (which should
>>>>>> ideally be named FuzzyKmeansCluster) to use the ClusterBase.
>>>>>>
>>>>>> I am getting *the same center* for all the clusters. To aid the
>>>>>> conversion, all I did was remove the center vector from the
>>>>>> SoftCluster class and reuse the one from ClusterBase. This
>>>>>> essentially makes no change in the tests, which pass correctly.
>>>>>>
>>>>>> So I am questioning whether the implementation keeps the average
>>>>>> center at all? Has anyone who has used FuzzyKMeans experienced this?
>>>>>>
>>>>>>
>>>>>> Robin
>>>>>>

