mahout-user mailing list archives

From Isabel Drost <>
Subject Re: Mahout interest at Berlin universities
Date Tue, 07 Jul 2009 19:51:06 GMT
On Monday 06 July 2009 18:54:40 Grant Ingersoll wrote:
> > However the basic assumptions and implications (e.g. data locality)
> > are known only by few groups/ people at least in the IR and data mining
> > domains. 
> This is always the case with new things.  It is impossible to keep up
> with all the things happening.


> It's why it is important to keep trying to raise visibility like we are
> doing. 
> FWIW, I see the same here, although Hadoop has a lot of buzz right now.

With all the buzz around, I think it is also important that people realise that 
Hadoop is no silver bullet that solves all their parallel programming 
problems. One should explore its limitations as well.

> > Any time I asked people using Apache software whether they are
> > subscribed to the corresponding user mailing list, the answer was a
> > questioning face and a no. I tried to make clear why
> > participation is important - I guess we will see in the near future
> > whether I was successful ;) 
> Participation takes a whole other level of commitment.   People need
> to be able to quickly see the benefit or be willing to be on the
> cutting edge.  It's hard to join a project in the early stages because
> it may very well be the case that the project doesn't make it.  I
> think the ASF raises the chances of success, but it doesn't guarantee
> it.

+1 I'd guess for researchers it may be even harder: doing a PhD is more than a 
full-time job already. And open source work is not exactly recognised on 
one's list of scientific work...

> > I was surprised to see people only vaguely aware of the GSoC
> > program.
> GSOC is relatively small, so I don't find it that surprising.  And,
> they cut back this year, too.

Hmm, probably my own selective perception tricked me.

> > and found it hard to set up a demo application. I think having some
> > JavaDoc, tutorial, and setup documentation for each release version on
> > our website might help people get started more easily?
> I've been working on this a lot lately and agree it is important for
> us for 0.2. Some rework of the landing web page to include quicker 
> links to source, etc. would be helpful.


> Having some sites in production will also be useful, once we get
> there.  All in good time. 

I guess that might be the harder part currently as we are still in pretty 
early stages. Though there are people testing Mahout.

> The key right now is for us committers to 
> make sure we are reviewing patches, improving the code and helping new
> contributors feel welcome and help them become committers when
> appropriate.


> > My first thought was to prepare a task with the goal of building a new
> > blog "search engine". They could build a system that identifies
> > clusters of blogs on a common topic, work on the link graph in the
> > blogosphere, detect new emerging topics and the like. Before preparing
> > the final seminar proposal, I would like to ask you whether there is
> > anything you might want those students to work on during their
> > winter-term.
> That sounds pretty involved to get done in a semester, but maybe it
> depends on the level of student.  I could also see things like
> benchmarking, setting up clusters and running/tuning.  Creating demos,
> etc.  In other words, let them do a couple of projects.

Depends on the number of students - but yes, that was the idea.


QOTD: Words are the voice of the heart. 
  |\      _,,,---,,_       Web:   <>
  /,`.-'`'    -.  ;-;;,_  
 |,4-  ) )-,_..;\ (  `'-' 
'---''(_/--'  `-'\_) (fL)  IM:  <xmpp://>
