mahout-dev mailing list archives

From "Andrew Purtell (JIRA)" <>
Subject [jira] [Commented] (MAHOUT-1045) Cluster evaluators returning bad results
Date Sun, 19 Aug 2012 18:39:37 GMT


Andrew Purtell commented on MAHOUT-1045:

The integration tests checked in with this patch appear to depend on an absolute path
hard-coded to the developer's home directory, so they fail on other machines:

mvn -Dhadoop.version=2.0.0-alpha clean install
Tests in error: 
  testClusterEvaluator(org.apache.mahout.clustering.MAHOUT1045Test): /Users/jeff/Desktop/jeff/kmeans-clusters/clusters-27-final
  testCDbwEvaluator(org.apache.mahout.clustering.MAHOUT1045Test): /Users/jeff/Desktop/jeff/kmeans-clusters/clusters-27-final
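
For illustration, one way a test like this could avoid a machine-specific path is to create its working directory under the system temp location at run time. This is only a minimal sketch, not code from the patch: the class name PortableTestPaths and the method createClustersDir are hypothetical, and only the directory names kmeans-clusters and clusters-27-final are taken from the failure output above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Mahout or of the MAHOUT-1045 patch).
public class PortableTestPaths {

  /**
   * Creates <tmp>/mahout-1045-XXXX/<runName>/clusters-27-final and returns it.
   * The base directory comes from java.io.tmpdir, so nothing depends on a
   * particular user's home directory.
   */
  static Path createClustersDir(String runName) throws IOException {
    Path base = Files.createTempDirectory("mahout-1045-");
    Path clusters = base.resolve(runName).resolve("clusters-27-final");
    return Files.createDirectories(clusters);
  }

  public static void main(String[] args) throws IOException {
    Path clusters = createClustersDir("kmeans-clusters");
    // The printed location differs per machine and per run.
    System.out.println("clusters dir: " + clusters);
  }
}
```

A test would then write or copy its input clusters into the returned directory instead of reading them from a fixed location.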

> Cluster evaluators returning bad results
> ----------------------------------------
>                 Key: MAHOUT-1045
>                 URL:
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.6, 0.7, 0.8
>         Environment: Several environments and data sets
>            Reporter: Pat Ferrel
>             Fix For: 0.8
>         Attachments: first-time-density-nan.txt, MAHOUT-1045.patch, MAHOUT-1045.patch, MAHOUT-1045.patch, MAHOUT-1045.patch
> With real-world crawl data, the intra-cluster density from ClusterEvaluator is almost
> always NaN. The CDbw inter-cluster density is almost always 0. I have also seen several cases
> where CDbw fails to return any results, but I have not yet tracked down why.
> I have sent a link to an 8G data set that reproduces these errors to Jeff Eastman.
