mahout-dev mailing list archives

From "Ravi Mummulla (JIRA)" <>
Subject [jira] [Commented] (MAHOUT-1177) GSOC 2013: Reform and simplify the clustering APIs
Date Thu, 23 May 2013 05:23:22 GMT


Ravi Mummulla commented on MAHOUT-1177:

Hi Folks,
I am new to this project, but have experience with Hadoop. I work in the Seattle area in the
Big Data space and I am also working on my second graduate degree (in Math and Stat). My intent
is not GSoC participation; I just want to contribute to the Mahout project. Please let me
know how I can help.

> GSOC 2013: Reform and simplify the clustering APIs
> --------------------------------------------------
>                 Key: MAHOUT-1177
>                 URL:
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Dan Filimon
>              Labels: gsoc2013, mentor
> Clustering is one of the most used features in Mahout and has many applications [].
> We have lots of clustering algorithms. There's:
> - basic k-means
> - canopy clustering
> - Dirichlet clustering
> - Fuzzy k-means
> - Spectral k-means
> - Streaming k-means [coming soon]
> We want to make them easier to use by updating the APIs and making sure they all work in
> the same way and have consistent inputs, outputs, diagnostics and documentation.
> This is a great way to gain an in-depth understanding of clustering algorithms and to
> familiarize yourself with Hadoop, Mahout clustering and good software engineering principles.

