From: Adam Kocoloski
To: user@couchdb.apache.org
Subject: Re: Multiple search criteria with ranges
Date: Mon, 15 Dec 2008 10:24:04 -0500

Hi Dan,

It's not a general-purpose solution, but in the specific example you gave, where there's only one continuous variable, you might be able to do something like the following. Use a map function that emits your filterable quantities as a list:

    emit([doc.beds, doc.baths, doc.price], null)

and then query that view multiple times with:

    startkey=[4,2,350000]&endkey=[4,2,400000]
    startkey=[4,3,350000]&endkey=[4,3,400000]
    startkey=[5,2,350000]&endkey=[5,2,400000]
    startkey=[5,3,350000]&endkey=[5,3,400000]

The advantage is that you get only the data you need; the disadvantage is that the number of queries is the product of the number of values in each discrete field (here 2 bed values x 2 bath values = 4 queries), so it grows quickly as you add fields to the filter, and there's no easy way to skip fields (you'd need to query with every possible value of that field, or else write an additional view that doesn't emit it).
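As a rough sketch of the multi-query approach, the key ranges can be generated rather than written by hand. The helper name `buildRangeQueries` and its argument shapes are made up for illustration; the only thing taken from the message is a view keyed on [beds, baths, price] and the startkey/endkey parameters.

```javascript
// Hypothetical helper (not part of CouchDB): builds one startkey/endkey
// query string per combination of discrete values, for a view whose keys
// are [beds, baths, price].
function buildRangeQueries(discreteValues, range) {
  // Start with a single empty prefix and extend it one field at a time,
  // producing every combination of the discrete values.
  let prefixes = [[]];
  for (const values of discreteValues) {
    const next = [];
    for (const prefix of prefixes) {
      for (const value of values) {
        next.push([...prefix, value]);
      }
    }
    prefixes = next;
  }
  // Append the continuous range bounds and URL-encode the JSON keys.
  return prefixes.map((prefix) => {
    const startkey = JSON.stringify([...prefix, range.min]);
    const endkey = JSON.stringify([...prefix, range.max]);
    return "startkey=" + encodeURIComponent(startkey) +
           "&endkey=" + encodeURIComponent(endkey);
  });
}

// Beds 4-5, baths 2-3, price 350000-400000.
const queries = buildRangeQueries([[4, 5], [2, 3]], { min: 350000, max: 400000 });
for (const q of queries) {
  console.log(decodeURIComponent(q));
}
```

Each string would be appended to the view URL as its query string. The number of strings returned is the product of the lengths of the discrete value lists, which is the cost mentioned above.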
A possible middle ground would be to query this same view with a single request:

    startkey=[4,2,350000]&endkey=[5,3,400000]

You'll still need to filter the results on the client side, since e.g. a 4 bed, 2 bath, $600k listing would get included, but at least the data volume would be smaller than doing the whole intersection yourself. If you go this route, the discrete vs. continuous variable thing doesn't really matter; just arrange the keys so that the one with the greatest discriminating power comes first.

Best,
Adam

On Dec 14, 2008, at 12:06 PM, Dan Woolley wrote:

> I'm researching Couchdb for a project dealing with real estate listing
> data. I'm very interested in Couchdb because the schema less nature,
> RESTful interface, and potential off-line usage with syncing fit my
> problem very well. I've been able to do some prototyping and search on
> ranges for a single field very successfully. I'm having trouble
> wrapping my mind around views for a popular use case in real estate,
> which is a query like:
>
> Price = 350000-400000
> Beds = 4-5
> Baths = 2-3
>
> Any single range above is trivial, but what is the best model for
> handling this AND scenario with views? The only thing I've been able
> to come up with is three views returning doc id's - which should be
> very fast - with an array intersection calculation on the client side.
> Although I haven't tried it yet, that client side calculation worries
> me with a potential document with 1M records - the client would
> potentially be dealing with calculating the intersection of multiple
> 100K element arrays. Is that a realistic calculation?
>
> Please tell me there is a better model for dealing with this type of
> scenario - or that this use case is not well suited for Couchdb at
> this time and I should move along.
>
>
> Dan Woolley
> profile: http://www.linkedin.com/in/danwoolley
> company: http://woolleyrobertson.com
> product: http://dwellicious.com
> blog: http://tzetzefly.com
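
For the single-request middle ground in the reply above, the client-side filtering step might look something like this. It is only a sketch: `filterRows` is a hypothetical helper, the sample rows are made up, and it assumes each view row carries its key as [beds, baths, price].

```javascript
// Hypothetical client-side filter for rows returned by a single request
// with startkey=[4,2,350000]&endkey=[5,3,400000].
// Each row is assumed to look like { id, key: [beds, baths, price] }.
function filterRows(rows, ranges) {
  // Keep a row only if every key component lies within its own range.
  return rows.filter((row) =>
    row.key.every((value, i) => value >= ranges[i][0] && value <= ranges[i][1])
  );
}

// Made-up rows, all of which sort between the startkey and endkey above.
const sampleRows = [
  { id: "a", key: [4, 2, 375000] },
  { id: "b", key: [4, 2, 600000] }, // in the key range, but over the price limit
  { id: "c", key: [5, 3, 390000] },
  { id: "d", key: [5, 1, 360000] }, // in the key range, but too few baths
];

const matches = filterRows(sampleRows, [[4, 5], [2, 3], [350000, 400000]]);
console.log(matches.map((row) => row.id));
```

Rows "b" and "d" show why the filter is needed: array keys are compared element by element, so anything whose leading elements fall between the two bounds is returned regardless of the later elements.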