mahout-user mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: syntheticcontroldata clustering example failure due to combiner
Date Thu, 11 Jun 2009 17:22:06 GMT
Depending upon the T1 and T2 values you provide, Canopy will produce the 
initial "k" cluster centers for the K-Means step. Those then prime the 
iterations and the result should retain the same number of clusters.

Jeff
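
To make the hand-off described above concrete, here is a minimal, self-contained sketch of the idea. It is not Mahout's code: the class name CanopySeedSketch, the helper names, the Euclidean distance and the sample points are all assumptions made for illustration. It only shows that the number of canopies falls out of T1 and T2, and that k-means seeded with those centers ends with the same number of clusters.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and data are hypothetical, not Mahout's API.
public class CanopySeedSketch {

    // Euclidean distance (an assumption; any distance measure could be used).
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Single pass over the points with thresholds t1 > t2. A point within t1 of a
    // canopy's founding point contributes to that canopy's centroid; a point that
    // is within t2 of no founding point starts a new canopy.
    static List<double[]> canopyCenters(List<double[]> points, double t1, double t2) {
        List<double[]> seeds = new ArrayList<>();   // founding point of each canopy
        List<double[]> sums = new ArrayList<>();    // running coordinate sums
        List<Integer> counts = new ArrayList<>();   // points accumulated per canopy
        for (double[] p : points) {
            boolean stronglyBound = false;
            for (int c = 0; c < seeds.size(); c++) {
                double d = distance(seeds.get(c), p);
                if (d < t1) {
                    double[] s = sums.get(c);
                    for (int i = 0; i < s.length; i++) {
                        s[i] += p[i];
                    }
                    counts.set(c, counts.get(c) + 1);
                }
                if (d < t2) {
                    stronglyBound = true;
                }
            }
            if (!stronglyBound) {
                seeds.add(p);
                sums.add(p.clone());
                counts.add(1);
            }
        }
        List<double[]> centers = new ArrayList<>();
        for (int c = 0; c < sums.size(); c++) {
            double[] s = sums.get(c);
            double[] center = new double[s.length];
            for (int i = 0; i < s.length; i++) {
                center[i] = s[i] / counts.get(c);
            }
            centers.add(center);
        }
        return centers;
    }

    // Plain Lloyd iterations primed with the given centers. An empty cluster keeps
    // its previous center, so the number of clusters never changes.
    static List<double[]> kMeans(List<double[]> points, List<double[]> initial,
                                 int maxIterations, double convergenceDelta) {
        List<double[]> centers = new ArrayList<>();
        for (double[] c : initial) {
            centers.add(c.clone());
        }
        int k = centers.size();
        int dim = points.get(0).length;
        for (int iter = 0; iter < maxIterations; iter++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : points) {
                int nearest = 0;
                for (int c = 1; c < k; c++) {
                    if (distance(centers.get(c), p) < distance(centers.get(nearest), p)) {
                        nearest = c;
                    }
                }
                for (int i = 0; i < dim; i++) {
                    sums[nearest][i] += p[i];
                }
                counts[nearest]++;
            }
            double maxShift = 0.0;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // keep the old center for an empty cluster
                }
                double[] updated = new double[dim];
                for (int i = 0; i < dim; i++) {
                    updated[i] = sums[c][i] / counts[c];
                }
                maxShift = Math.max(maxShift, distance(centers.get(c), updated));
                centers.set(c, updated);
            }
            if (maxShift <= convergenceDelta) {
                break;
            }
        }
        return centers;
    }

    // Two well-separated groups of made-up 2-D points.
    static List<double[]> samplePoints() {
        List<double[]> pts = new ArrayList<>();
        pts.add(new double[] {0, 0});
        pts.add(new double[] {1, 0});
        pts.add(new double[] {0, 1});
        pts.add(new double[] {10, 10});
        pts.add(new double[] {11, 10});
        pts.add(new double[] {10, 11});
        return pts;
    }

    public static void main(String[] args) {
        List<double[]> pts = samplePoints();
        List<double[]> seeds = canopyCenters(pts, 4.0, 2.0);
        List<double[]> clusters = kMeans(pts, seeds, 10, 0.001);
        System.out.println("canopies=" + seeds.size() + " clusters=" + clusters.size());
    }
}
```

In this sketch nobody chooses "k" directly. Tightening T1 and T2 (for example to 1.0 and 0.5 on the same sample points) yields more canopies and therefore more k-means clusters, which is why the thresholds have to be tuned to the data.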


Benson Margulies wrote:
> So what are you guys doing to get from an unpredictable number of
> canopies to a 'k' value for k-means and an initial assignment of each
> item to one cluster?
>
>
> On Thu, Jun 11, 2009 at 12:49 PM, Adil Aijaz<adil@yahoo-inc.com> wrote:
>   
>> Jeff,
>>
>> Thanks for the quick turnaround on this issue. Just tested it and the canopy
>> creation and kmeans both work now on syntheticcontroldata. I get 7 canopies
>> and 7 clusters. Collection logic in close() is not pretty but can't think of
>> a workaround myself.
>>
>> adil
>>
>> Jeff Eastman wrote:
>>     
>>> r783617 removed the CanopyCombiner and refactored its semantics back into
>>> the reducer. Updated unit tests pass and Synthetic Control with Canopy
>>> produces 6 clusters. Kmeans also runs and produces 6 clusters. I really
>>> don't like doing stuff in close() but see no practical alternative. Ideas
>>> are still welcomed.
>>>
>>> Jeff
>>>
>>>
>>> Jeff Eastman wrote:
>>>       
>>>> Adil Aijaz wrote:
>>>>         
>>>>> 2. There is a bug in
>>>>> examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/kmeans/Job.java
>>>>> that called runJob from main function with my provided arguments transposed.
>>>>> So, my convergenceDelta was interpreted as t1, t1 as t2, and t2 as
>>>>> convergenceDelta. I will commit a patch as soon as I get approval for
>>>>> opensource commits from my employer, however, I thought I'd put it out there
>>>>> in case someone else is going through the same issue.
>>>>>
>>>>>           
>>>> r783585 fixed the parameter ordering bug. Still working on the Combiner
>>>> problem.
>>>>
>>>> Thanks Adil,
>>>> Jeff
>>>>
>>>>
>>>>         
>>     
>
>
>   

