From: Jan Lehnardt
To: couchdb-user@incubator.apache.org
Subject: Re: Multiple filters on a large data set
Date: Fri, 26 Sep 2008 12:31:50 +0200
Message-Id: <1E0C812D-E968-4AEB-BA64-1DBDB2C1B099@apache.org>

Hello Jaap,

On Sep 25, 2008, at 20:25, Jaap van der Plas wrote:

> We like to build a database with approx. 50.000 documents. Every
> document has at least about 10 fields. We think CouchDB would be nice
> solution because there's a big variety of documents (like different
> fields).

I agree with the assessment that CouchDB is a good fit here.

> We intend to use this for an online catalog. We like end users to be
> able to search with multiple fields (like filters). Any combination of
> fields should be possible.
>
> 1. Is this possible to do this without creating temp_views for every
> query?

If you want to do fulltext search, I'd second Ayende's suggestion of
using the external search API to let Lucene do this kind of indexing
and searching.

Views can get you there if you create a comprehensive index of all
fields and all their combinations (possible, probably not nice) and
allow for a keyword search. You could even split up your filter words
and put single chars, tuples and so on into the index to fake fulltext
search. This is possible but probably not the most elegant solution.

> 2. If not, is using temp_views viable performance wise on this sort of
> dataset.

Using temp views is not wise unless you want to test view function
definitions on a small amount of data. A production system should not
rely on them. (This is, again, a rule of thumb, but with a pretty big
thumb!)

Cheers
Jan
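
A rough sketch of the "index of all fields and all their combinations" idea from the reply above, to show why it is possible but probably not nice. This is an illustration only: real CouchDB map functions are written in JavaScript and emit rows into a view, whereas this simulates the same logic in plain Python. The names map_all_combinations, build_index and lookup, and the sample documents, are hypothetical and not part of any CouchDB API.

```python
from itertools import combinations


def map_all_combinations(doc):
    """Yield (key, doc_id) rows, one per non-empty subset of the doc's fields.

    Stands in for a view map function; each yielded row is what an
    emit(key, value) call would produce.
    """
    # Skip internal fields such as _id; sort so keys are built in a fixed order.
    fields = sorted(name for name in doc if not name.startswith("_"))
    for size in range(1, len(fields) + 1):
        for subset in combinations(fields, size):
            key = tuple((name, doc[name]) for name in subset)
            yield key, doc["_id"]


def build_index(docs):
    """Collect all rows into a dict mapping key -> set of document ids."""
    index = {}
    for doc in docs:
        for key, doc_id in map_all_combinations(doc):
            index.setdefault(key, set()).add(doc_id)
    return index


def lookup(index, **filters):
    """Find documents matching every given field=value filter."""
    # Sort the filters by field name so the key matches the indexed order.
    key = tuple(sorted(filters.items()))
    return index.get(key, set())


# Hypothetical catalog documents with differing fields.
docs = [
    {"_id": "a", "color": "red", "size": "L", "brand": "acme"},
    {"_id": "b", "color": "red", "size": "M", "brand": "acme"},
    {"_id": "c", "color": "blue", "size": "L"},
]

index = build_index(docs)
matches = lookup(index, color="red", brand="acme")
rows_for_three_fields = len(list(map_all_combinations(docs[0])))
```

A document with n fields produces 2^n - 1 rows here. Assuming exactly 10 fields per document, that is 1023 rows per document, or about 51 million rows for 50,000 documents, which is the practical cost behind "possible, probably not nice".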