mahout-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: kMeans Help
Date Sat, 27 Jun 2009 00:45:22 GMT
Still no dice.

On Jun 26, 2009, at 7:59 PM, Grant Ingersoll wrote:

> We need to handle that separately from the various jobs, then.
> That was one of the things that was different about the KMeansJob
> call.
>
> On Jun 26, 2009, at 7:45 PM, Jeff Eastman wrote:
>
>> Found the call in the syntheticcontrol/kmeans.Job had true for the  
>> overwrite output flag. Don't think that was your problem, but  
>> something similar must be at work.
>>
>>
>>
>> Jeff Eastman wrote:
>>> Running the latest trunk, I get a file not found exception running  
>>> synthetic control on the $output/data file. Looks like output got  
>>> deleted somewhere but have not discovered where yet. Perhaps  
>>> Canopy is broken or KMeans is purging output?
>>>
>>>
>>> Grant Ingersoll wrote:
>>>> I'm running trunk.  Using the data at http://people.apache.org/wikipedia/n2.tar.gz
>>>> (a dump of 2302 documents from a Lucene index of Wikipedia.  The
>>>> chunks file in that same directory contains the original files).   
>>>> Vectors are normalized using L2.
>>>>
>>>> When I run K-Means on it via:
>>>>
>>>>   org.apache.mahout.clustering.kmeans.KMeansDriver
>>>>     --input /Users/grantingersoll/projects/lucene/solr/wikipedia/devWorks/n2/part-full.txt
>>>>     --clusters /Users/grantingersoll/projects/lucene/solr/wikipedia/devWorks/n2/clusters
>>>>     --k 10
>>>>     --output /Users/grantingersoll/projects/lucene/solr/wikipedia/devWorks/n2/k-output
>>>>     --distance org.apache.mahout.utils.CosineDistanceMeasure
>>>>
>>>> I get the two directories seen in n2-output.  The clusters-0 and  
>>>> clusters-1 files both contain a single vector which is all 0.
>>>>
>>>> I've also tried SquaredEuclidean, but to no avail.
>>>>
>>>> Any insight into what I'm doing wrong would be appreciated.
>>>>
>>>> Thanks,
>>>> Grant
>>>>
>>>>
>>>
>>>
>>>
>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
> using Solr/Lucene:
> http://www.lucidimagination.com/search
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search

