mahout-dev mailing list archives

From Grant Ingersoll
Subject Re: Areas needing help
Date Mon, 14 Sep 2009 16:32:42 GMT

On Sep 14, 2009, at 12:23 PM, Tanton Gibbs wrote:

> Hi,
> I'd like to start working more with the mahout code, making small
> improvements here and there.  I want to primarily focus on performance
> improvements and unit testing (mainly because I enjoy doing that).
> However, I'd like to improve a place that needs improvement.  If you
> know of a section of code that you would like to see refactored/sped
> up/tested could you please send it to the list or to me?  Or, if there
> is a wiki page on this, please point me to it and accept my apologies.

Testing and profiling of the clustering, classification, and
collaborative filtering code would be very welcome. There are several
open issues in JIRA related to these (MAHOUT-165 comes to mind).

I think just running some examples at scale and reporting back results  
would be great as well.  You can also start by looking at

One idea is to take the Wikipedia examples I put up at

  (I will donate the code soon) and try running them at larger scale  
for Wikipedia.
