mahout-dev mailing list archives

From "Toby DiPasquale" <t...@cbcg.net>
Subject Re: Various Ideas from Mahout BOF
Date Sun, 13 Apr 2008 14:16:07 GMT
On Sun, Apr 13, 2008 at 6:41 AM, Grant Ingersoll <gsingers@apache.org> wrote:
> Back from ApacheCon EU and can't sleep...
>
>  Had a fairly decent BOF the other night given the stage we are at.  Got to
> meet both Isabel and Karl.  Had a good number of people in the room and had
> some nice discussions on the challenges we are facing.
>
>  One of the main discussions was on obtaining data to be used by anyone
> coming in to the project who may not have data.  There were a couple of good
> suggestions:  Apache mail archives, Apache web logs (possibly, after being
> redacted), as well as the usual suspects like Wikipedia, Reuters collection
> (David Lewis), etc.  If someone wants to take up creating a simple util for
> cleaning up mail archives for use, that would be a great contribution.

What about http://infochimps.org/ ?

>  Also had some discussion on how Mahout will be accepted in the community.
> Will academics be interested?  Will companies be interested?  My take was
> that both are taking a wait and see approach, probably, but we do have some
> early supporters which seems to be positive.  I think a lot of companies
> view ML as "secret sauce", so we'll see.  However, I think this will go away
> as we build out our algorithms.  As w/ search, ML will become more of a
> commodity and it will not be so much that any company is using ML,
> but look at all the cool things your application can do with ML.

Speaking from the business side of things, I am very interested in
Mahout. A lot of places lack the internal resources to really build
and QA a lot of these algorithms ourselves. Having a vibrant community
around this is basically bringing a little bit of R to the big-time, IMO.
I'd like nothing better than for ML to become commoditized on top of
Hadoop. And it seems as if there's plenty of interest from the
academic side, given all the volunteering for the GSoC for Mahout ;-)

-- 
Toby DiPasquale
