mahout-dev mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Input and ideas needed on a test framework for Mahout
Date Sun, 10 Apr 2011 16:50:31 GMT
I think it sounds like a great project.
I believe that one of the biggest barriers to improving performance is
simply understanding where the time is being spent. Is it I/O or CPU? Is it
the combiner steps, the shuffle, the mapper, or the reducer?

What you are suggesting, and what I am sort of thinking of, sounds a lot
like what Apache Vaidya is doing (
http://hadoop.apache.org/common/docs/r0.20.2/vaidya.html). This is a great
project and perhaps something to build on.

It would be great to see the output of such a tool. I'm sure that it would
discover some clear, easy bottlenecks.

On Sun, Apr 10, 2011 at 4:19 PM, Oliver Fischer <o.b.fischer@swe-blog.net> wrote:

> Dear all,
>
> I would like to ask for your help and ideas.
>
> As I mentioned a few days ago, I will be working over the next few months on a
> performance test framework for Mahout. It will be called Thotti.
>
> Thotti shall be able to run arbitrary tests in a distributed environment
> and support non-distributed and distributed algorithms. At the moment it is
> planned to utilize Amazon EC2 for distributed test execution. Thotti will
> also be able to generate reports on the test execution.
>
> Since Thotti should be a community framework, I need your help. Please let me
> know what you expect from a framework like Thotti.
>
> Best Regards,
>
> Oliver
>
> --
> Oliver B. Fischer, Schönhauser Allee 64, 10437 Berlin
> Certified ScrumMaster, OMG Certified Expert in BPM - Fundamental
> Tel. +49 30 44793251, Mobil: +49 178 7903538
> Mail: o.b.fischer@swe-blog.net
> Blog: http://logbuch.freiheitsgrade-se.de
>
