mahout-user mailing list archives

From Mark <static.void....@gmail.com>
Subject Problems running examples
Date Sun, 05 Jun 2011 18:07:01 GMT
Hi all. I'm trying to run examples/bin/build-reuters.sh, but I keep
running into the following exception.

INFO: Deleting mahout-work/reuters-kmeans-clusters
Jun 5, 2011 10:29:37 AM org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Jun 5, 2011 10:29:37 AM org.apache.hadoop.io.compress.CodecPool getCompressor
INFO: Got brand-new compressor
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
     at java.util.ArrayList.RangeCheck(ArrayList.java:547)
     at java.util.ArrayList.get(ArrayList.java:322)
     at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:108)
     at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:101)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
     at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:58)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)

I am also confused by the build-reuters.sh code itself. There seems
to be a mismatch between what is expected to be local and what
should be on HDFS. For example, the comments on lines 77-79 are:

# we know reuters-out-seqdir exists on a local disk at
# this point, if we're running in clustered mode,
# copy it up to hdfs

However, upon inspection you'll notice that reuters-out-seqdir is
actually on HDFS. It seems like seqdirectory never writes to
local disk, even with MAHOUT_LOCAL=true set.
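
For reference, a rough way to check where the directory actually ends up.
This is only a sketch: the where_is helper is a made-up name, the
mahout-work/reuters-out-seqdir path is assumed from the log output above,
and the second check only runs if the hadoop command is on the PATH. Note
that "hadoop fs" looks at whatever the default filesystem is, which may
itself be the local disk when Hadoop is not configured for a cluster.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory is visible on the local
# disk and/or through the Hadoop default filesystem.
where_is() {
  dir="$1"
  # Plain local-disk check.
  if [ -d "$dir" ]; then
    echo "local: $dir"
  fi
  # "hadoop fs -test -d" exits 0 when the path is a directory on the
  # default filesystem; skipped entirely if hadoop is not installed.
  if command -v hadoop >/dev/null 2>&1 && hadoop fs -test -d "$dir" 2>/dev/null; then
    echo "hadoop default fs: $dir"
  fi
}

# Assumed path; pass a different one as the first argument if needed.
where_is "${1:-mahout-work/reuters-out-seqdir}"
```

If only the second line prints while MAHOUT_LOCAL is set, that would
match what I am seeing.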

Any ideas?

Thanks
