Date: Fri, 16 Apr 2010 19:56:37 +0100
Subject: Re: mahout/solr integration
From: Sean Owen
To: mahout-dev@lucene.apache.org

On Fri, Apr 16, 2010 at 7:39 PM, Jake Mannix wrote:
> I will start playing around with Anthony's github-based stuff, and
> see where a patch can be made.  The question is where it would
> go?  It's a fully functioning project already over on its own.

I suppose that's my question too -- what is being fixed by a move?

The point about integrating with the ML community by having a 'LISP-speaking' module, to be friendlier, is a good one. It does call into question the Mahout identity -- is it for tinkering with in a lab to explore new algorithms (for which Clojure/LISP makes sense)? Or is it for engineers and production systems at scale -- where Hadoop/Java is the lingua franca? Yeah, this is not just another language, but for a somewhat different audience. Maybe "both" is nice.

Before version 1.0 I think it can be harmful to let the project remit range too broadly. We all know how open-source goes. It's for-fun, spare-time. It's easy to start things and hard to finish them. I'm just getting concerned we end up with 10 half-finished modules rather than 5 finished ones. I don't have reason to believe this module would be orphaned; this is tilting at windmills.
It's just a general concern raised by early expansion. After the foundation we have now is solid -- naturally, careful expansion is a next step.

Do I hear consensus to think about this post-1.0, post-TLP, post-book, and continue working together to see where the projects go? (There's some value to staying separate -- it forces you to not integrate the code in cheap and tangled ways -- you have to proceed through public APIs.) Or is there a significant synergy from tight integration, which warrants combining projects right now?

I don't want to make too much hay over this one question so much as bring up the larger issue. I wouldn't scream if Clojure landed in the repo.