mahout-user mailing list archives

From Dhruv
Subject Re: Need Help for performing Text Mining using Mahout
Date Fri, 26 Aug 2011 15:44:14 GMT
Atul, welcome to Mahout!

Although there are many interesting things you can do with your data, I would recommend starting
with k-means clustering to get a feel for Mahout's input mechanism, sequence files, etc.

You can find a detailed explanation of clustering on our wiki. The Mahout in Action book is
also a good resource.
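
To make the idea concrete, here is a rough sketch in plain Python of what the clustering step does with strings like yours. This is not Mahout code: the function names (vectorize, cosine_distance, mean_vector, kmeans) and the sample lines are made up for illustration, and whitespace tokenization with raw term counts is an assumption on my part. In Mahout the vectors would live in sequence files and the job would run on Hadoop, but the logic is the same: turn each string into a term vector, measure distance between vectors, and let k-means group them.

```python
import math
import random
from collections import Counter


def vectorize(line):
    """Turn one input string into a sparse term-frequency vector."""
    return Counter(line.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse vectors."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Average a non-empty list of sparse vectors into a centroid."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: weight / len(vectors) for term, weight in total.items()}


def kmeans(vectors, k, iterations=10, seed=0):
    """Plain k-means: assign to nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)  # initial centroids picked from the data
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: cosine_distance(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = mean_vector(members)
    return assignment


# Hypothetical sample input: one string per line, as in the original question.
lines = [
    "apple banana fruit",
    "banana apple fruit salad",
    "car engine wheel",
    "engine car wheel tire",
]
vectors = [vectorize(line) for line in lines]
labels = kmeans(vectors, k=2)
for line, label in zip(lines, labels):
    print(label, line)
```

The distance function is the part you would swap out to experiment with other similarity measures. The number of clusters k has to be chosen up front, which is worth keeping in mind when you move to the real data.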

On Aug 26, 2011, at 8:05 AM, Atul Aggarwal wrote:

> Hi,
> I am working on text mining of a huge data set. I have a big set of strings
> (separated by newline characters), on which I want to run an algorithm
> that gives me similarity distances between the strings. Further, I want
> to use those distances to group the strings based on their similarity.
> I am new to Mahout, but I believe that for the size of data I have,
> Mahout can be a good option. Can anyone guide me on how I should
> proceed with this problem?
> Thanks for your help!
> Regards,
> Atul Aggarwal
