mahout-dev mailing list archives

From Sean Owen
Subject Re: mahout/solr integration
Date Fri, 16 Apr 2010 18:56:37 GMT
On Fri, Apr 16, 2010 at 7:39 PM, Jake Mannix wrote:
> I will start playing around with Anthony's github-based stuff, and
> see where a patch can be made.  The question is where it would
> go?  It's a fully functioning project already over on its own.

I suppose that's my question too -- what is being fixed by a move?

The point about integrating with the ML community by having a
'LISP-speaking' module, to be friendlier, is a good one. It does call
into question the Mahout identity -- is it for tinkering with in a lab
to explore new algorithms (for which Clojure/LISP makes sense)? or is
it for engineers and production systems at scale -- where Hadoop/Java
is the lingua franca? Yeah, this is not just another language, but for
a somewhat different audience.

Maybe "both" is nice. Before version 1.0 I think it can be harmful to
let the project remit range too broadly. We all know how open-source
goes. It's for-fun, spare-time. It's easy to start things and hard to
finish them. I'm just getting concerned we end up with 10
half-finished modules rather than 5 finished ones. I don't have reason
to believe this module would be orphaned; this is tilting at windmills.
It's just a general concern raised by early expansion.

After the foundation we have now is solid -- naturally, careful
expansion is a next step. Do I hear consensus to think about this
post-1.0, post TLP, post book? and continue working together to see
where the projects go? (There's some value to staying separate --
forces you to not integrate the code in cheap and tangled ways -- have
to proceed through public APIs.)

Or is there a significant synergy from tight integration, which
warrants combining projects right now?

I don't want to make too much hay over this one question as much as
bring up the larger issue. I wouldn't scream if Clojure landed in the
