From: Robert Newson
To: user@couchdb.apache.org
Date: Fri, 4 Jun 2010 22:49:46 +0100
Subject: Re: clucene and couchdb

Everything I have testwise is at http://github.com/rnewson/couchdb-lucene

I understand the reluctance to pull in the Java Virtual Machine just
to use Lucene but, in my experience, there's no other comparable
library for features or performance, including clucene.

I'd love to see performance comparison numbers, though. I last
benchmarked Java Lucene vs clucene many years ago (clucene had the
edge) but that was pre-1.5 JVM technology. I think Java is way out in
front now.

B.

On Fri, Jun 4, 2010 at 10:44 PM, Norman Barker wrote:
> Robert,
>
> thanks, that makes sense I will do an eval on the design document
> functions. Do you have test cases for Java Lucene and CouchDB that I
> could use for comparison?
>
> I think a lot of people will want to use Java Lucene since CLucene is
> behind Lucene (CLucene is always catching up) but I can't always use
> Java and it will be good to do a comparison.
>
> thanks,
>
> Norman
>
> On Fri, Jun 4, 2010 at 3:34 PM, Robert Newson wrote:
>> The reason couchdb-lucene requires you to write a javascript function
>> is that there is no single mapping from a couchdb document to a Lucene
>> Document that suits everyone.
>>
>> B.
>>
>> On Fri, Jun 4, 2010 at 10:31 PM, Norman Barker wrote:
>>> Hi,
>>>
>>> I am writing a clucene indexer for CouchDB; I have
>>> update_notifications and _fti as a db handler working. I am using
>>> stdout/stdin for the communication and it is looking good.
>>>
>>> Looking at http://wiki.apache.org/couchdb/Full_text_search I see that
>>> the index property in the design document is a javascript function and
>>> I am wondering why? For views I can understand why you would want to
>>> do an evaluation but for Lucene could we just use a JSON Path
>>> reference?
>>>
>>> Thoughts appreciated. Since I am in C++ and SpiderMonkey is available
>>> I could do an eval of the javascript, but it might be easier just to
>>> parse the JSON path.
>>>
>>> We will be putting this CLucene implementation in the public domain
>>> once I have cleared the necessary internal paperwork.
>>>
>>> CLucene is dual licensed (Apache and LGPL) and I am using Cajun (BSD)
>>> for the JSON parsing so should I host this separately or take out a
>>> JIRA ticket to have it included in CouchDB?
>>>
>>> thanks,
>>>
>>> Norman
>>>
>>
>
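
To illustrate the point made in this thread about there being no single
mapping from a document to index fields, here is a minimal sketch that
contrasts a path-based mapping with a function-based one. It is written in
Python purely for illustration: it is not couchdb-lucene's API (whose index
functions are JavaScript), and every name in it (resolve_path,
index_by_paths, index_by_function, the sample field names) is hypothetical.

```python
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]
Fields = Dict[str, List[str]]


def resolve_path(doc: Doc, path: str) -> Any:
    """Follow a dotted path such as 'author.name'; return None if absent."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def index_by_paths(doc: Doc, paths: List[str]) -> Fields:
    """Path-based mapping: each configured path yields at most one field."""
    fields: Fields = {}
    for path in paths:
        value = resolve_path(doc, path)
        if value is not None:
            fields[path] = [str(value)]
    return fields


def index_by_function(doc: Doc) -> Optional[Fields]:
    """Function-based mapping: arbitrary logic decides what is indexed.

    This hypothetical function skips non-posts and drafts, lowercases the
    title, emits one value per tag, and builds a combined 'text' field.
    A fixed list of paths cannot express these choices on its own.
    """
    if doc.get("type") != "post" or doc.get("draft"):
        return None  # document is not indexed at all
    return {
        "title": [doc["title"].lower()],
        "tag": list(doc.get("tags", [])),
        "text": [doc["title"] + " " + doc.get("body", "")],
    }
```

A path list is simpler to parse, which is the appeal raised in the original
question. The function form costs an evaluation step but lets each
application decide which documents to skip and how fields are derived.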