mahout-user mailing list archives

From Juan Francisco Contreras Gaitan <juanfcocontre...@hotmail.com>
Subject RE: String clustering and other newbie questions
Date Tue, 01 Sep 2009 13:28:13 GMT

Well, I have reread Ted's answer after looking at some of the information Isabel gave
me, and I think you are right. But I am still not sure why Mahout's k-means algorithm
cannot be used with strings once a string distance metric has been defined. Taking Jeff's
advice, I could use a Map between doubles and strings: storing doubles throughout the
algorithm, and retrieving the strings to compute distances in the measuring steps. Would
that make sense?
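
To make the idea concrete, here is a minimal sketch of such a mapping. It is plain
Python and does not use any Mahout API; the names StringIdDistance and medoid are
hypothetical, and Levenshtein edit distance is only an assumed choice of string metric.
One caveat: the mapping only covers the measuring step. k-means also averages points
to get new centroids, and the average of two ids does not correspond to any string, so
the sketch picks a medoid (the member closest to all the others) instead of a mean.

```python
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (standard dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


class StringIdDistance:
    """Hypothetical helper: the algorithm stores only double ids, and the
    strings are looked up again whenever a distance has to be measured."""

    def __init__(self, strings: List[str]) -> None:
        # Assumption: ids are simply the list positions, stored as doubles.
        self.id_to_string: Dict[float, str] = {
            float(i): s for i, s in enumerate(strings)
        }

    def distance(self, id_a: float, id_b: float) -> int:
        return levenshtein(self.id_to_string[id_a], self.id_to_string[id_b])


def medoid(ids: List[float], measure: StringIdDistance) -> float:
    """Return the id with the smallest total distance to the other ids.
    This stands in for the centroid (mean) step of k-means."""
    return min(ids, key=lambda c: sum(measure.distance(c, o) for o in ids))


measure = StringIdDistance(["ab", "abc", "abcd"])
center = medoid([0.0, 1.0, 2.0], measure)
print(measure.id_to_string[center])
```

With a medoid update like this the procedure is closer to k-medoids than to k-means,
which may be part of the reason the stock k-means was not recommended for strings.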

Regards,
jfcg

> Subject: Re: String clustering and other newbie questions
> From: gsingers@apache.org
> Date: Tue, 1 Sep 2009 05:33:34 -0700
> To: mahout-user@lucene.apache.org
> 
> 
> On Sep 1, 2009, at 5:06 AM, Juan Francisco Contreras Gaitan wrote:
> 
> >
> > Ok, I see. Sorry for my lack of knowledge on these matters (I am going to  
> > read all the documentation you gave me closely).
> >
> > But if I understood you correctly, and as far as I know, Mahout has its  
> > own k-means implementation. Could I then use it for my purposes  
> > instead of a DP-like setup?
> 
> I think Ted was saying that DP is the only one that would work for  
> what you described, but it's also possible we aren't understanding the  
> problem right either.
> 
> Obviously, one of the things we as a project need to develop more is  
> guidelines on which approaches work for which types of problems.
> 
> -Grant
