mahout-user mailing list archives

From Paritosh Ranjan <pran...@xebia.com>
Subject Re: empty vector out of clusterdump
Date Tue, 20 Mar 2012 17:13:51 GMT
Can you try the cluster output post processor (clusterpp) once?

The documentation on how to use it is here:
https://cwiki.apache.org/MAHOUT/top-down-clustering.html

If clusterpp also gives you empty vectors, then the problem is somewhere
in the clustering step; otherwise the problem is in the cluster dumper.
Either way, it will at least help narrow down the problem area.
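
To compare the two outputs quickly, something like the following might help. It is
only a rough sketch: it assumes the text dump has cluster header lines shaped like
the "VL-1705919{n=844992 c=[] r=[]}" line quoted below, and the second sample line
and the helper name find_empty_centroids are made up for illustration, not taken
from Mahout.

```python
import re

# Matches a cluster header line as it appears in the quoted clusterdump output,
# e.g. "VL-1705919{n=844992 c=[] r=[]}". The format is assumed from that sample.
CLUSTER_LINE = re.compile(r"(?P<id>[A-Z]+-\d+)\{n=(?P<n>\d+)\s+c=\[(?P<c>[^\]]*)\]")


def find_empty_centroids(lines):
    """Return (cluster_id, n) for every cluster whose center vector is empty."""
    empty = []
    for line in lines:
        m = CLUSTER_LINE.search(line)
        # An empty capture for c=[...] means the center has no entries.
        if m and not m.group("c").strip():
            empty.append((m.group("id"), int(m.group("n"))))
    return empty


sample = [
    "VL-1705919{n=844992 c=[] r=[]}",          # line from the original report
    "VL-42{n=10 c=[0:1.5, 3:2.0] r=[0:0.1]}",  # hypothetical non-empty cluster
]
print(find_empty_centroids(sample))
```

Running it over the lines of each dump and comparing which cluster ids come back
should show whether the empty center appears in both places or only in one.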

On 20-03-2012 17:40, Baoqiang wrote:
> Yes, I used -cl in the kmeans step. Only the biggest cluster is empty; all the
> others are not. I don't know why.
>
> Sent from my iPhone
>
> On Mar 20, 2012, at 1:36 AM, Paritosh Ranjan <pranjan@xebia.com> wrote:
>
>> Did you run kmeans with the -cl <run input vector clustering> option set to "true"?
>>
>>
>> On 19-03-2012 07:38, Baoqiang Cao wrote:
>>> Hi,
>>>
>>> I ran mahout kmeans and then clusterdump. Here is the result for the
>>> biggest cluster (844992 members):
>>>
>>> VL-1705919{n=844992 c=[] r=[]}
>>>          Top Terms:
>>>          Weight : [props - optional]:  Point:
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>          1.0 : [distance=0.0]: []
>>>
>>> What does this mean? Is this whole cluster made of empty vectors (members)?
>>>
>>> Best,
>>> Baoqiang

