mahout-commits mailing list archives

From conflue...@apache.org
Subject [CONF] Apache Mahout > k-means-commandline
Date Mon, 16 Apr 2012 13:28:00 GMT
Space: Apache Mahout (https://cwiki.apache.org/confluence/display/MAHOUT)
Page: k-means-commandline (https://cwiki.apache.org/confluence/display/MAHOUT/k-means-commandline)
Comment: https://cwiki.apache.org/confluence/display/MAHOUT/k-means-commandline?focusedCommentId=27844105#comment-27844105

Comment added by Jeff Eastman:
---------------------------------------------------------------------

The line "hdfs://RH01:9000/user/hadoop/testdata/synthetic_control.data not a SequenceFile"
in your transcript indicates that you are running k-means directly on the synthetic control
data file, which is a plain text file. k-means expects its input as a Hadoop SequenceFile of
vectors. If you look at the synthetic control examples, you will see that they first call

   InputDriver.runJob(input, directoryContainingConvertedInput,
        "org.apache.mahout.math.RandomAccessSparseVector");

on this file and then invoke k-means on the resulting sequence file output.
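
As a quick way to confirm the diagnosis above, here is a minimal sketch (not part of Mahout) that tests whether a local file starts with the three magic bytes 'S', 'E', 'Q' that open every Hadoop SequenceFile. The class name SequenceFileCheck, the method looksLikeSequenceFile, and the sample numeric row are hypothetical names and data chosen for illustration. It uses only the Java standard library, so it only works on a local copy of the file, not directly on an hdfs:// path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not a Mahout or Hadoop class.
public class SequenceFileCheck {

    // Hadoop SequenceFiles begin with the ASCII bytes 'S', 'E', 'Q',
    // followed by a single version byte.
    private static final byte[] MAGIC = {'S', 'E', 'Q'};

    /** Returns true if the file starts with the SequenceFile magic bytes. */
    public static boolean looksLikeSequenceFile(Path file) throws IOException {
        byte[] header = new byte[MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int total = 0;
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    return false; // file is shorter than the magic header
                }
                total += n;
            }
        }
        return Arrays.equals(header, MAGIC);
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Check a local copy of the file named on the command line.
            Path file = Path.of(args[0]);
            System.out.println(file + ": " + looksLikeSequenceFile(file));
            return;
        }

        // Demo: a text file with one made-up numeric row, like raw text input.
        Path text = Files.createTempFile("sample", ".data");
        Files.write(text, "28.78 34.46 31.33\n".getBytes(StandardCharsets.UTF_8));

        // Demo: a file that merely starts with the SequenceFile header bytes.
        Path seq = Files.createTempFile("sample", ".seq");
        Files.write(seq, new byte[] {'S', 'E', 'Q', 6});

        System.out.println("text file: " + looksLikeSequenceFile(text));
        System.out.println("seq file:  " + looksLikeSequenceFile(seq));
    }
}
```

A file that fails this check has to go through a conversion step such as the InputDriver call above before k-means can read it. The check only inspects the header, so a true result does not guarantee that the file holds vectors of the type k-means expects.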

In reply to a comment by yexq:
[hadoop@RH01 ~]$ mahout kmeans -i testdata -o output -c clusters -dm org.apache.mahout.common.distance.CosineDistanceMeasure -x 5 -ow -cd 1 -k 25
Running on hadoop, using HADOOP_HOME=/mnt/userspace/hadoop-0.20.2
HADOOP_CONF_DIR=/mnt/userspace/hadoop-0.20.2/conf
12/04/16 12:51:48 INFO common.AbstractJob: Command line arguments: {--clusters=clusters, --convergenceDelta=1,
--distanceMeasure=org.apache.mahout.common.distance.CosineDistanceMeasure, --endPhase=2147483647,
--input=testdata, --maxIter=5, --method=mapreduce, --numClusters=25, --output=output, --overwrite=null,
--startPhase=0, --tempDir=temp}
12/04/16 12:51:49 INFO common.HadoopUtil: Deleting output
12/04/16 12:51:49 INFO common.HadoopUtil: Deleting clusters
12/04/16 12:51:49 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/04/16 12:51:49 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
12/04/16 12:51:49 INFO compress.CodecPool: Got brand-new compressor
Exception in thread "main" java.lang.IllegalStateException: java.io.IOException: hdfs://RH01:9000/user/hadoop/testdata/synthetic_control.data not a SequenceFile
	at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:63)
	at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:87)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:101)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:58)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
	at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.io.IOException: hdfs://RH01:9000/user/hadoop/testdata/synthetic_control.data not a SequenceFile
	at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1455)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
	at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:58)
	at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterable.iterator(SequenceFileIterable.java:61)
	... 16 more
Who can help me?
