mahout-user mailing list archives

From sharath jagannath <>
Subject Re: Clustering with KMeans
Date Tue, 08 Feb 2011 20:27:59 GMT
Dear Kate,

These commands are very similar to what my program is doing. Even with
these commands I got only one cluster.

1. Generate tf-idf vectors using the code samples mentioned previously
together with Seq2Sparse Command

2. ../bin/mahout canopy -t1 3 -t2 2.5 -i sj/output/tfidf-vectors -o sj/canopy/output/ -ow

3. ../bin/mahout kmeans -i sj/output/tfidf-vectors -o sj/kmeans/output -x 10 -cd 0.001 -ow -c sj/canopy/output/clusters-0

4. ../bin/mahout clusterdump -s sj/kmeans/output/clusters-1/

From this, I guess my distance thresholds and convergence delta need to be
tuned, though I am not sure about it.
I have been playing around with those values without any improvement.
I am pretty new at this and have not been able to work out what is going wrong.
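
To illustrate why the thresholds matter, here is a minimal sketch of a simplified canopy pass. It is not Mahout's implementation: the function names, the toy points, and the choice of Euclidean distance are all assumptions for illustration, and the distance measure in the run above may differ. It only shows that when T2 is larger than the distances between the points, every point is absorbed by the first canopy and a single seed cluster comes out.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance (assumed here; the real run may use another measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopies(points, t1, t2, distance=euclidean):
    """Simplified canopy pass (hypothetical helper, not Mahout's code).

    Returns a list of (center, members) pairs. Points within t1 of a
    center join its canopy; points within t2 are removed from the
    candidate list and can no longer start a canopy of their own.
    """
    if t1 < t2:
        raise ValueError("t1 must be greater than or equal to t2")
    remaining = list(points)
    result = []
    while remaining:
        center = remaining.pop(0)
        # Loose threshold: membership in this canopy.
        members = [center] + [p for p in remaining if distance(center, p) <= t1]
        result.append((center, members))
        # Tight threshold: drop points that are too close to seed a new canopy.
        remaining = [p for p in remaining if distance(center, p) > t2]
    return result


# Toy data, invented for this sketch.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 1.0), (2.0, 0.0)]

for t1, t2 in [(3.0, 2.5), (1.5, 1.0), (0.6, 0.4)]:
    print("t1=%s t2=%s canopies=%d" % (t1, t2, len(canopies(points, t1, t2))))
```

On these toy points, T1=3 and T2=2.5 produce a single canopy, while smaller thresholds produce more. If the same effect is at work here, comparing the thresholds against the typical distances between the tf-idf vectors would be a reasonable first check.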

Thanks a lot, and I appreciate your response.

