mahout-dev mailing list archives

From Tanton Gibbs <>
Subject Re: Areas needing help
Date Mon, 14 Sep 2009 17:25:20 GMT
If the CF unit tests have been started, then fleshing them out sounds
like a good place to start.  I've not used EMMA, so that sounds like

Thanks for the pointers!

On Mon, Sep 14, 2009 at 10:12 AM, Sean Owen <> wrote:
> FWIW I have been profiling the CF code a lot lately so think it has had a
> recent, close look. I might turn profilers to areas that haven't had as
> close a look.
> The CF unit tests are fine if lagging in completeness. I think it would be
> perhaps relatively simpler to locate some untested bits and add a test or
> two - if you can run EMMA on the current tests you will quickly find a few
> gaps.
> I think as a result of trying to test some code you will locate some useful
> ways to refactor and structure the code. That sort of thing strikes me as
> important now. Best to make some big code moves early while the API is
> understood to be very in flux, and to build a solid foundation.
> For instance this week I would like to continue by proposing to merge and
> reshuffle the utils and common packages.
> On Sep 14, 2009 5:33 PM, "Grant Ingersoll" <> wrote:
> On Sep 14, 2009, at 12:23 PM, Tanton Gibbs wrote:
> > Hi,
> >
> > I'd like to start working more with th...
> Testing and profiling of the clustering, classification and collab filtering
> code would be very welcome.   There are several open issues in JIRA related
> to these (MAHOUT-165 comes to mind).
> I think just running some examples at scale and reporting back results would
> be great as well.  You can also start by looking at
> One idea is to take the Wikipedia examples I put up at
> (I will
> donate the code soon) and try running them at larger scale for Wikipedia.
