mahout-user mailing list archives

From "Chandra Mohan, Ananda Vel Murugan" <Ananda.Muru...@honeywell.com>
Subject RE: K Mean Clustering on Two columns`
Date Tue, 18 Jun 2013 10:41:54 GMT
Hi, 

I implemented something similar in the following way. 

Created a class that implements org.apache.commons.math3.ml.clustering.Clusterable, with just
two member variables (double[] point and long id) and getter/setter methods.

Iterated through the data, created an instance of this class for each row, and added them to a list.
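
A minimal sketch of that iteration step, assuming the input is the comma-separated layout from the original question (Line No, age, income, sex, city). The names SelectedRow and ColumnSelector are hypothetical and used only for illustration. It keeps the line number as the id and columns 2 and 3 as the point, and uses only the JDK:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder: the line number plus the two selected columns.
class SelectedRow {
    final long id;         // value of the "Line No" column
    final double[] point;  // {age, income}

    SelectedRow(long id, double[] point) {
        this.id = id;
        this.point = point;
    }
}

// Hypothetical helper that selects columns 2 and 3 (1-based) from CSV lines.
class ColumnSelector {
    static List<SelectedRow> parse(List<String> lines) {
        List<SelectedRow> rows = new ArrayList<SelectedRow>();
        for (String line : lines) {
            String[] fields = line.trim().split(",");
            // Skip the header row and any line without a numeric line number.
            if (fields.length < 3 || !fields[0].trim().matches("\\d+")) {
                continue;
            }
            long id = Long.parseLong(fields[0].trim());
            double age = Double.parseDouble(fields[1].trim());
            double income = Double.parseDouble(fields[2].trim());
            rows.add(new SelectedRow(id, new double[] {age, income}));
        }
        return rows;
    }
}
```

Each SelectedRow would then be turned into an instance of the Clusterable class described above. Note that age and income are on very different scales, so scaling the columns before clustering may be worth considering.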

Then instantiated KMeansPlusPlusClusterer as below

org.apache.commons.math3.ml.clustering.KMeansPlusPlusClusterer<CustomPoint> clusterer
= new KMeansPlusPlusClusterer<CustomPoint>(4,100,new org.apache.commons.math3.ml.distance.CanberraDistance());

Then called cluster() on the clusterer as follows

List<CentroidCluster<CustomPoint>> clusterList = clusterer.cluster(points);

I was able to get the clusters this way. I don't know whether this is the right approach,
but it worked for me.
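
To get the cluster ID for each Line No, something along these lines may work. It is only a sketch: it requires commons-math3 (3.2 or later) on the classpath, the CustomPoint class is my guess at the shape described above, and ClusterIds and assign are hypothetical names. The cluster ID here is simply the index of the cluster in the returned list:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.commons.math3.ml.clustering.CentroidCluster;
import org.apache.commons.math3.ml.clustering.Clusterable;
import org.apache.commons.math3.ml.clustering.KMeansPlusPlusClusterer;
import org.apache.commons.math3.ml.distance.CanberraDistance;

// Assumed shape of the custom class: an id plus the coordinates to cluster on.
class CustomPoint implements Clusterable {
    private final long id;
    private final double[] point;

    CustomPoint(long id, double[] point) {
        this.id = id;
        this.point = point;
    }

    public long getId() {
        return id;
    }

    @Override
    public double[] getPoint() {
        return point;
    }
}

// Hypothetical helper: runs the clusterer and maps each id to a cluster index.
class ClusterIds {
    static Map<Long, Integer> assign(List<CustomPoint> points, int k, int maxIterations) {
        KMeansPlusPlusClusterer<CustomPoint> clusterer =
            new KMeansPlusPlusClusterer<CustomPoint>(k, maxIterations, new CanberraDistance());
        List<CentroidCluster<CustomPoint>> clusters = clusterer.cluster(points);

        Map<Long, Integer> idToCluster = new HashMap<Long, Integer>();
        for (int i = 0; i < clusters.size(); i++) {
            // Every point in cluster i keeps the id it was created with.
            for (CustomPoint p : clusters.get(i).getPoints()) {
                idToCluster.put(p.getId(), i);
            }
        }
        return idToCluster;
    }
}
```

The initial centers are chosen randomly, so the index given to a particular cluster can differ between runs. The number of points must also be at least k, otherwise cluster() throws an exception.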

Regards,
Anand.C

-----Original Message-----
From: syed kather [mailto:in.abdul@gmail.com] 
Sent: Tuesday, June 18, 2013 3:23 PM
To: user@mahout.apache.org
Subject: K Mean Clustering on Two columns`

Hi Team
   How do I do k-means clustering on 2 selected columns?



Line No,age,income,sex,city
1,22,1500,1,xxx
2,54,13450,2,yyy
-
-
-
-
-

The input looks like this, but I need to do clustering on columns 2 and 3.


How can I do that?

I tried using the synthetic k-means example, but I am not able to extract the
cluster ID corresponding to each Line No.

Please help me


Thanks and regards
Syed Abdul Kather



