mahout-dev mailing list archives

From Sean Owen
Subject Re: Input and ideas needed on a test framework for Mahout
Date Sun, 10 Apr 2011 16:50:31 GMT
I think it sounds like a great project.
I believe that one of the biggest barriers to improving performance is
simply understanding where the time is being spent. Is it I/O or CPU? Is it
the combiner step, the shuffle, the mapper, or the reducer?

What you are suggesting, and what I am sort of thinking of, sounds a lot
like what Apache Vaidya is doing. This is a great
project and perhaps something to build on.

It would be great to see the output of such a tool. I'm sure that it would
discover some clear, easy bottlenecks.

On Sun, Apr 10, 2011 at 4:19 PM, Oliver Fischer wrote:

> Dear all,
> I would like to ask for your help and ideas.
> As I mentioned a few days ago, I will be working over the next few months on a
> performance test framework for Mahout. It will be called Thotti.
> Thotti shall be able to run arbitrary tests in a distributed environment
> and support non-distributed and distributed algorithms. At the moment it is
> planned to utilize Amazon EC2 for distributed test execution. Thotti will
> also be able to generate reports on the test execution.
> Since Thotti should be a community framework, I need your help. Please let me
> know what you would expect from a framework like Thotti.
> Best Regards,
> Oliver
> --
> Oliver B. Fischer, Schönhauser Allee 64, 10437 Berlin
> Certified ScrumMaster, OMG Certified Expert in BPM - Fundamental
> Tel. +49 30 44793251, Mobil: +49 178 7903538
