mahout-user mailing list archives

From Ahmad Ammari <ammari...@gmail.com>
Subject Re: NewsKMeansClustering does not find any clusters!
Date Thu, 17 Nov 2011 17:36:32 GMT
Hi Jeff,

Could you please elaborate on what is meant by the -c path? I am running the
NewsKMeansClustering class directly from NetBeans (neither from a command-line
shell nor from the mahout launcher script), so I am not passing any options
to the run.

Thanks,
Ahmad

On Wed, Nov 16, 2011 at 5:22 PM, Jeff Eastman <jeastman@narus.com> wrote:

> K-means is attempting to load your initial clusters and is not finding
> any. Have you checked your -c path? You can also add -xm sequential so you
> can run the sequential algorithm. This allows you to use a debugger to
> verify your paths.
>
> -----Original Message-----
> From: Ahmad Ammari [mailto:ammariect@gmail.com]
> Sent: Wednesday, November 16, 2011 7:19 AM
> To: user@mahout.apache.org
> Subject: NewsKMeansClustering does not find any clusters!
>
> Hello,
>
> I am working through the Mahout examples in the clustering part of the book
> "Mahout in Action", particularly chapter 9. In Section 9.1.4, I am trying
> to run the class NewsKMeansClustering, whose source code I got from the
> companion source code files. As I understand it, the input directory
> "inputDir" should contain the input documents in SequenceFile format.
> Therefore, I tried to use the "reuters-seqfiles" directory that we
> generated with the seqdirectory program, run through the mahout launcher
> in chapter 8 (page 139). I then ran NewsKMeansClustering, which started
> fine until it threw a java.lang.IllegalStateException saying that no
> clusters were found, as follows:
>
> java.lang.IllegalStateException: No clusters found. Check your -c path.
> at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:60)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 16-Nov-2011 00:49:14 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
> INFO: map 0% reduce 0%
> 16-Nov-2011 00:49:14 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
> INFO: Job complete: job_local_0010
> 16-Nov-2011 00:49:14 org.apache.hadoop.mapred.Counters log
> INFO: Counters: 0
> Exception in thread "main" java.lang.InterruptedException: K-Means
> Iteration failed processing reutersClusters/canopy-centroids/clusters-0
> at org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:363)
> at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClustersMR(KMeansDriver.java:310)
> at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:237)
> at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:152)
> at clusterer.NewsKMeansClustering.main(NewsKMeansClustering.java:81)
> ------------------------------------------------------------------------
> BUILD FAILURE
> ------------------------------------------------------------------------
> Total time: 15.391s
> Finished at: Wed Nov 16 00:49:14 GMT 2011
> Final Memory: 10M/150M
> ------------------------------------------------------------------------
> Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec
> (default-cli) on project mahout-examples: Command execution failed. Process
> exited with an error: 1(Exit value: 1) -> [Help 1]
>
> To see the full stack trace of the errors, re-run Maven with the -e switch.
> Re-run Maven using the -X switch to enable full debug logging.
>
> For more information about the errors and possible solutions, please read
> the following articles:
> [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
>
> What does it mean that no clusters were found?
>
> Is the input directory wrong? If so, what input should I give the class?
>
> I tried changing the canopy thresholds (250, 120) to other values, and
> also tried switching the canopy clustering from EuclideanDistanceMeasure
> to CosineDistanceMeasure, to no avail.
>
> Many thanks in advance,
> Ahmad
>
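
When the job is launched from an IDE rather than the launcher script, the "-c path" is simply the initial-clusters directory handed to K-means; in the trace above that appears to be reutersClusters/canopy-centroids/clusters-0. One way to check it before running K-means is a small standalone test like the sketch below. It is not part of Mahout or of the book's code. It assumes the job uses the local filesystem (the trace shows LocalJobRunner) and that cluster files follow Hadoop's usual part-* naming; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a Mahout class: inspects the initial-clusters directory.
public class InitialClustersCheck {

    // Counts non-empty files named part-* directly under dir.
    // Returns 0 if dir does not exist or is not a directory.
    static int countNonEmptyPartFiles(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return 0;
        }
        int count = 0;
        // The glob skips Hadoop's hidden checksum files such as .part-r-00000.crc.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "part-*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && Files.size(p) > 0) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Default taken from the exception message; pass another path as the first argument.
        Path clustersDir = Paths.get(
                args.length > 0 ? args[0] : "reutersClusters/canopy-centroids/clusters-0");
        int found = countNonEmptyPartFiles(clustersDir);
        // The absolute path shows which working directory the IDE resolved the path against.
        System.out.println(clustersDir.toAbsolutePath() + ": " + found + " non-empty part file(s)");
    }
}
```

A count of 0 means the directory is missing, empty, or resolved against a different working directory than expected. A non-zero count only shows that files exist: a SequenceFile can be non-empty and still contain no clusters, for example if the canopy step produced none, so running with -xm sequential under a debugger, as suggested above, remains the more thorough check.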
