mahout-user mailing list archives

From Suneel Marthi <suneel_mar...@yahoo.com>
Subject Re: K-means: No input clusters found
Date Tue, 24 Dec 2013 20:33:41 GMT
The initial clusters need to be written to a file with a name like 'part-xxxx' (typically inside a
kmeans-init-clusters directory), not to a single file named kmeans-init-clusters as you have it.
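
To illustrate the naming rule, here is a minimal sketch. It assumes the cluster loader only picks up entries whose names start with "part-". It uses plain java.nio on the local filesystem rather than Hadoop's FileSystem API, and PartFileCheck and findPartFiles are hypothetical names chosen for this example, not Mahout or Hadoop classes.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: mimics a "part-" name filter on the local filesystem.
// PartFileCheck and findPartFiles are hypothetical names, not Mahout or Hadoop APIs.
final class PartFileCheck {

    // Returns the entries under clustersIn whose names start with "part-".
    // If clustersIn is itself a plain file, its own name is tested instead.
    static List<Path> findPartFiles(Path clustersIn) throws IOException {
        List<Path> found = new ArrayList<>();
        if (Files.isDirectory(clustersIn)) {
            try (DirectoryStream<Path> children =
                     Files.newDirectoryStream(clustersIn, "part-*")) {
                for (Path child : children) {
                    found.add(child);
                }
            }
        } else if (Files.isRegularFile(clustersIn)
                && clustersIn.getFileName().toString().startsWith("part-")) {
            found.add(clustersIn);
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        Path scratch = Files.createTempDirectory("scratch");

        // Layout from the original message: the seed clusters are one plain file.
        Path singleFile = Files.createFile(scratch.resolve("kmeans-init-clusters"));
        System.out.println("single file -> " + findPartFiles(singleFile).size() + " part files");

        // Suggested layout: a directory holding a part-xxxx file.
        Path dir = Files.createDirectory(scratch.resolve("kmeans-init-clusters-dir"));
        Files.createFile(dir.resolve("part-00000"));
        System.out.println("directory   -> " + findPartFiles(dir).size() + " part files");
    }
}
```

Under that assumption, a plain file named kmeans-init-clusters matches nothing, which is consistent with the "No input clusters found" error, while a directory containing part-00000 yields one match. On HDFS the equivalent change would be to write the seed clusters to a path such as /scratch/kmeans-init-clusters/part-00000 and keep passing the directory to KMeansDriver.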





On Tuesday, December 24, 2013 2:15 PM, Sameer Tilak <sstilak@live.com> wrote:
 
Hi all,

I get the following problem when I run k-means clustering on my real data. Any help with this
would be great!


Here is the data that I read out of the SequenceFile:


022960 value: 022960:{269830:1.0,2042:1.0,145659:1.0,143547:1.0,219265:1.0,321251:1.0,202350:1.0,258610:1.0,239068:1.0,259181:1.0,259177:1.0,33391:1.0,414092:1.0,139519:1.0,428431:1.0,277140:1.0,279116:1.0,426540:1.0,225715:1.0,331909:1.0,347374:1.0,257840:1.0}
022963 value: 022963:{256857:1.0,269830:1.0,2042:1.0,145659:1.0,143547:1.0,219265:1.0,321251:1.0,202350:1.0,258610:1.0,239068:1.0,259181:1.0,259177:1.0,33391:1.0,414092:1.0,139519:1.0,428431:1.0,277140:1.0,279116:1.0,426540:1.0,225715:1.0,438788:1.0,347374:1.0,257840:1.0}
022966 value: 022966:{122295:1.0,143547:1.0,359770:1.0,349739:1.0,279116:1.0,347374:1.0,225715:1.0,295315:1.0,239068:1.0,426540:1.0,25381:1.0,258670:1.0,139519:1.0,140726:1.0,202350:1.0,33391:1.0,80747:1.0,317618:1.0,315249:1.0,219265:1.0,258610:1.0,269830:1.0,446719:1.0,414092:1.0,259177:1.0,15069:1.0,259181:1.0,145659:1.0,257840:1.0,2042:1.0,8916:1.0,349953:1.0}
022968 value: 022968:{382600:1.0,204616:1.0,120442:1.0,213430:1.0,274369:1.0,267345:1.0,350041:1.0,259356:1.0,83126:1.0,270754:1.0,139519:1.0,362853:1.0,279116:1.0}
022969 value: 022969:{270754:1.0,120442:1.0,259356:1.0,139519:1.0,274369:1.0,279116:1.0,236587:1.0,287087:1.0,445965:1.0}
022972 value: 022972:{270695:1.0,382600:1.0,426510:1.0,213430:1.0,274369:1.0,267345:1.0,350041:1.0,259356:1.0,83126:1.0,270754:1.0,63705:1.0,139519:1.0,279116:1.0}

Here is where I write seed clusters to the file. It shows that it wrote 10 clusters.

String KmeansInitClusterFile = "/scratch/kmeans-init-clusters";

SimpleKMeansClustering::generateClusters wrote the following cluster to the file (/scratch/kmeans-init-clusters):
CL-0{n=0 c=021105 = [25381:1.000, 139519:1.000, 140726:1.000, 145659:1.000, 239068:1.000, 279116:1.000, 349739:1.000] r=021105 =]}
SimpleKMeansClustering::generateClusters wrote the following cluster to the file:
CL-1{n=0 c=021111 = [25381:1.000, 139519:1.000, 140726:1.000, 145659:1.000, 239068:1.000, 279116:1.000, 349739:1.000] r=021111 =]}
SimpleKMeansClustering::generateClusters wrote the following cluster to the file:
CL-2{n=0 c=021117 = [49100:1.000, 120442:1.000, 258280:1.000, 259339:1.000, 259356:1.000, 268294:1.000, 269084:1.000, 270702:1.000, 270754:1.000, 274369:1.000, 274626:1.000] r=021117 =]}
SimpleKMeansClustering::generateClusters wrote the following cluster to the file:
CL-3{n=0 c=021118 = [120442:1.000, 258280:1.000, 259339:1.000, 259356:1.000, 269084:1.000, 270702:1.000, 270754:1.000, 274369:1.000, 274626:1.000] r=021118 =]}
SimpleKMeansClustering::generateClusters wrote the following cluster to the file:
CL-4{n=0 c=021119 = [426510:1.000] r=021119 =]}
SimpleKMeansClustering::generateClusters wrote the following cluster to the file:
CL-5{n=0 c=021120 = [9071:1.000, 49100:1.000, 63705:1.000, 120442:1.000, 139519:1.000, 140663:1.000, 145659:1.000, 213430:1.000, 239068:1.000, 251173:1.000, 258280:1.000, 259356:1.000, 267345:1.000, 268294:1.000, 270695:1.000, 276249:1.000, 279116:1.000, 309165:1.000, 350040:1.000, 445676:1.000] r=021120 =]}
SimpleKMeansClustering::generateClusters wrote the following cluster to the file:
CL-6{n=0 c=021122 = [6240:1.000, 259356:1.000, 259830:1.000, 270754:1.000, 274369:1.000, 388477:1.000, 426510:1.000] r=021122 =]}
SimpleKMeansClustering::generateClusters wrote the following cluster to the file:
CL-7{n=0 c=021123 = [49100:1.000, 138703:1.000, 139070:1.000, 139519:1.000, 259356:1.000, 268294:1.000, 270695:1.000, 277065:1.000, 279116:1.000, 309165:1.000, 445834:1.000] r=021123 =]}
SimpleKMeansClustering::generateClusters wrote the following cluster to the file:
CL-8{n=0 c=021124 = [1667:1.000, 9071:1.000, 15397:1.000, 29237:1.000, 49100:1.000, 63705:1.000, 138703:1.000, 139070:1.000, 139519:1.000, 140663:1.000, 213430:1.000, 238903:1.000, 259356:1.000, 260088:1.000, 267345:1.000, 268294:1.000, 270695:1.000, 270754:1.000, 274347:1.000, 276249:1.000, 279116:1.000, 291707:1.000, 295315:1.000, 309165:1.000, 313307:1.000, 317618:1.000, 320741:1.000, 349953:1.000, 350040:1.000, 387714:1.000, 445676:1.000] r=021124 =]}
SimpleKMeansClustering::generateClusters wrote the following cluster to the file:
CL-9{n=0 c=021125 = [49100:1.000, 139519:1.000, 268294:1.000, 279116:1.000, 384009:1.000] r=021125 =]}

I use the following method in my class to perform k-means:

KMeansDriver.run(this.conf, new Path(SparceVectorizedCidFile), new Path(KmeansInitClusterFile),
                 new Path(KmeansClustersResultsFile), new EuclideanDistanceMeasure(),
                 0.001, 5, true, 1.0, false);

13/12/24 11:09:27 INFO kmeans.KMeansDriver: Input: /scratch/SparceVectorizedConceptIds Clusters In: /scratch/kmeans-init-clusters Out: /scratch/KmeansClustersResultsFile Distance: org.apache.mahout.common.distance.EuclideanDistanceMeasure
13/12/24 11:09:27 INFO kmeans.KMeansDriver: convergence: 0.001 max Iterations: 5
java.lang.IllegalStateException: No input clusters found in /scratch/kmeans-init-clusters. Check your -c argument.
    at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:212)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:143)
    at myanalytics.SimpleKMeansClustering.runKmeansDriver(SimpleKMeansClustering.java:209)
    at myanalytics.SimpleKMeansClustering.main(SimpleKMeansClustering.java:269)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
-bash-4.1$ hadoop dfs -ls /scratch/kmeans-init-clusters
Warning: $HADOOP_HOME is deprecated.

Found 1 items
-rw-r--r--   1 userid supergroup       2850 2013-12-24 11:09 /scratch/kmeans-init-clusters
-bash-4.1$