Date: Fri, 11 Apr 2008 12:11:11 +0100
From: Noah Slater
To: couchdb-dev@incubator.apache.org
Cc: couchdb-user@incubator.apache.org
Subject: Re: Lazy Fulltext Search
Message-ID: <20080411111110.GA22960@bytesexual.org>
In-Reply-To: <3F0276A4-6C09-4BB6-8227-B17B6CDDAE18@apache.org>
References: <20080411102355.GP29836@bytesexual.org> <3F0276A4-6C09-4BB6-8227-B17B6CDDAE18@apache.org>

On Fri, Apr 11, 2008 at 12:37:30PM +0200, Jan Lehnardt wrote:
> The associated benefit is that you delay the costs of generation of
> indexes until you actually need them.

If you're generating indexes JIT, you can't really count them as indexes any
more; you're essentially doing regular non-indexed searching.

I would have thought that for a database the trade-off you want to make is one
where you sacrifice time/resources in bulk so that queries are lightning fast.
If you move the indexing to query time, you still have to expend exactly the
same time/resources as before, and you have slowed down your query response
time significantly. For large collections of documents, indexing could easily
take hours to complete.

>> My understanding is that the KEY element of CouchDB Views is that they are
>> generated in advance, and incrementally, before you use them.
>
> And why not use the same principle for fulltext indexes?

I thought this was the original plan for the full text search: that the index
was built in advance, and incrementally, before you use it. It sounds to me
like you're suggesting a departure from this.

Maybe I am getting confused.

-- 
Noah Slater - The Apache Software Foundation
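
To make the trade-off in this message concrete, here is a minimal sketch in Python of a toy inverted index that can be updated either at write time or at query time. It is an illustration only and does not reflect CouchDB's actual view or fulltext machinery: the class name `FulltextIndex`, the `eager` flag, and the `docs_indexed_during_query` counter are all hypothetical names invented for this example. Document updates and deletions are not handled.

```python
from collections import defaultdict


class FulltextIndex:
    """Toy inverted index (hypothetical; not CouchDB's API)."""

    def __init__(self, eager):
        self.eager = eager
        self.postings = defaultdict(set)    # word -> ids of docs containing it
        self.pending = []                   # docs written but not yet indexed
        self.docs_indexed_during_query = 0  # indexing work paid for by queries

    def _index(self, doc_id, text):
        # The per-document indexing cost is the same in both modes.
        for word in text.lower().split():
            self.postings[word].add(doc_id)

    def write(self, doc_id, text):
        if self.eager:
            # Built in advance, incrementally: pay the cost at write time.
            self._index(doc_id, text)
        else:
            # Lazy: defer the cost until somebody searches.
            self.pending.append((doc_id, text))

    def search(self, word):
        # In lazy mode the query must first catch up on every pending doc.
        for doc_id, text in self.pending:
            self._index(doc_id, text)
        self.docs_indexed_during_query += len(self.pending)
        self.pending.clear()
        return sorted(self.postings.get(word.lower(), set()))
```

Both modes return the same results and do the same total amount of indexing; what differs is who waits for it. With `eager=True` a search never indexes anything, while with `eager=False` the first search after a batch of writes absorbs the indexing cost for the whole batch, which is the slowdown the message is concerned about.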