mahout-user mailing list archives

From Kiran Kumar Bushireddy <kirankumarsm...@gmail.com>
Subject Re: How to find characteristics of the clusters with mahout?
Date Fri, 10 Aug 2012 06:52:16 GMT
It depends on the important keywords in each document. Documents with
similar keywords are mapped to the same cluster. It all comes down to
distance calculations: the distance from each centroid to each document is
calculated, and each document joins the cluster whose centroid is closest.
You can evaluate the clusters by passing the -e parameter, which reports
intra-cluster and inter-cluster density.
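
To make the distance idea concrete, here is a rough sketch in plain Python. It is not Mahout code: the feature names, vectors, and centroids are made-up toy values, and Euclidean distance is only an assumption (your job may use a different distance measure). It assigns each vector to its nearest centroid and then lists the highest-weighted features of each centroid, which is one simple way to read off what a cluster is "about".

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_to_nearest(points, centroids):
    """For each point, return the index of the closest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
        for p in points
    ]


def top_features(centroid, feature_names, k=2):
    """Return the k feature names with the largest weight in the centroid."""
    ranked = sorted(zip(feature_names, centroid), key=lambda fw: fw[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical toy data: four documents over four keyword features.
feature_names = ["hadoop", "cluster", "football", "goal"]
points = [
    [0.9, 0.8, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.7],
    [0.1, 0.0, 0.8, 0.9],
]
# Hypothetical centroids, as a clustering run might produce them.
centroids = [
    [0.85, 0.80, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.80],
]

assignments = assign_to_nearest(points, centroids)
for idx, centroid in enumerate(centroids):
    members = [i for i, a in enumerate(assignments) if a == idx]
    print(idx, members, top_features(centroid, feature_names))
```

The dimensions with the largest centroid weights are the ones the members of that cluster have in common, so they are a reasonable first answer to "what characterises this cluster".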

Thanks,
Kiran

On Fri, Aug 10, 2012 at 2:30 AM, Necati Demir <ndemir@demir.web.tr> wrote:

> That's right; I want to learn why vectors are assigned to a particular
> cluster.
> Suppose that each vector represents a person's behaviour. I want to learn
> which behaviour patterns are present in each cluster.
>
> On 10 August 2012 08:06, Paritosh Ranjan <pranjan@xebia.com> wrote:
>
> > I think you want to know why vectors are being assigned to a particular
> > cluster.
> > Different clustering algorithms work in different ways, so I think some
> > code will be needed for it.
> >
> > The way I do it is to take a small set of vectors and debug the
> > clustering algorithm using its sequential version.
> > It's fast and makes things clear.
> >
> > There are also some cluster evaluators that might help. I don't know
> > much about them, but they are worth a look.
> >
> >
> > On 10-08-2012 02:42, Necati Demir wrote:
> >
> >> Hello,
> >>
> >> I am using Mahout 0.8, and after clustering some data I use this
> >> command to see the results:
> >>
> >>   mahout clusterdump --seqFileDir clusters/clusters-77/ --pointsDir clusters/clusteredPoints/
> >>
> >> I also want to learn why rows end up in the same cluster. I think I
> >> could write code to find which features/dimensions are similar within
> >> a cluster.
> >>
> >> Without writing code, can I find out why rows are clustered together?
> >>
> >> **In a nutshell: I want to learn the characteristics of the clusters.**
>
> --
> Necati DEMİR
> --------------------
>



-- 
Thanks & Regards,
Kiran Kumar
