mahout-user mailing list archives

From Dan Filimon <dangeorge.fili...@gmail.com>
Subject Re: How to improve clustering?
Date Tue, 26 Mar 2013 16:33:12 GMT
Hi,

Could you tell us more about the kind of data you're clustering, which
distance measure you're using, and what the dimensionality of the data
is?

On Tue, Mar 26, 2013 at 6:21 PM, Sebastian Briesemeister
<sebastian.briesemeister@unister-gmbh.de> wrote:
> Dear Mahout-users,
>
> I am facing two problems when clustering instances with Fuzzy c-Means
> (cosine distance, random initial clustering):
>
> 1.) I always end up with one large set of rubbish instances. All of them
> have a uniform cluster probability distribution and therefore sit in the
> exact middle of the cluster space.
> The cosine distance between instances within this cluster ranges from 0
> to 1.
>
> 2.) Some of my clusters have the same or a very similar center.
>
> Apart from these two problems, the clustering seems to work fine.
>
> Does anybody have an idea how my clustering can be improved?
>
> Regards
> Sebastian
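
To make the two symptoms above concrete, here is a minimal Python sketch
(standard library only) of the textbook fuzzy c-means membership formula
under cosine distance, plus a check for near-duplicate centers. It is not
Mahout's implementation: the function names, the fuzziness default m=2.0
and the tolerance tol=0.01 are assumptions chosen for illustration. It
shows that an instance roughly equidistant from all centers gets a uniform
membership vector, and that a larger m flattens memberships further.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity, clamped at 0; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    # Clamp to avoid tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def memberships(point, centers, m=2.0):
    """Textbook fuzzy c-means: u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1))."""
    dists = [cosine_distance(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zeros = [j for j, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if j in zeros else 0.0
                for j in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]


def near_duplicate_centers(centers, tol=0.01):
    """Index pairs of centers whose cosine distance is below tol (assumed threshold)."""
    return [(i, j)
            for i in range(len(centers))
            for j in range(i + 1, len(centers))
            if cosine_distance(centers[i], centers[j]) < tol]


if __name__ == "__main__":
    axes = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    # Equidistant from all three centers, so memberships come out uniform.
    print(memberships([1, 1, 1], axes))
    # Same point, two fuzziness values: the larger m gives a flatter result.
    print(memberships([2, 1, 0], axes, m=2.0))
    print(memberships([2, 1, 0], axes, m=10.0))
    # The first two centers point in almost the same direction.
    print(near_duplicate_centers([[1, 0], [1, 0.001], [0, 1]]))
```

If most of the "rubbish" instances turn out to have nearly equal distances
to every center, a lower fuzziness value or a different feature weighting
may be worth trying; if the duplicate check reports many pairs, the number
of clusters may be higher than the data supports.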
