From: Brendon McLean
To: user@couchdb.apache.org
Subject: Re: Strategy for arbitrary predicate queries in Couchdb
Date: Mon, 24 Jan 2011 17:42:50 +0200
Message-Id: <37DAF263-40BB-4D56-A9DD-202E2B95CE32@gmail.com>
In-Reply-To: <4D3D88E3.5080507@zedeler.dk>
References: <38934C4C-127F-4892-8236-A7957BFAA1A3@gmail.com> <4D3D88E3.5080507@zedeler.dk>

Hi Michael,

I'm not sure how you would do that. The predicates are non-exclusive so I'd have to literally perform the aggregation outside of the database. I suppose if you went this way, you could check the number of matching documents for each predicate and choose the smallest set to examine outside of the database.

Regards,
Brendon.

On 24 Jan 2011, at 16:12 , Michael Zedeler wrote:

> On 2011-01-24 13:49, Brendon McLean wrote:
>> Our documents really only contain two types of data:
>>
>> Numeric attributes
>> Boolean attributes
>> The boolean attributes essentially mark a document as belonging to one or more non-exclusive sets. The numeric attributes will always only need to be summed.
>> One way of structuring the document is like this:
>>
>> {
>>   "id": 3123123,
>>   "attr": {"x": 2, "y": 4, "z": 6},
>>   "sets": ["A", "B", "C"]
>> }
>> With this structure, it's easy to work out aggregate x, y, z values for the sets A, B and C, but it gets more complicated when you want to see the aggregates for intersections like A&C.
>>
>> In this small case I could emit keys for all combinations of ABC ("A, B, C, AB, AC, BC, ABC"), but I'm worried about how this will scale. Our documents could belong to some combination of 80 sets and it is fronted by a user-interface which can construct any conceivable combination of them.
>>
>> I'm inclined to think that this isn't a job for CouchDB, and perhaps MongoDB or something else would be better suited to this problem.
> We have solved the problem for attributes by creating one index for each attribute (running one query for each index) and then just aggregating the results after.
>
> Regards,
>
> Michael.
>
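
For illustration, here is a minimal Python sketch of the two ideas discussed in this thread, run against in-memory data rather than CouchDB. The function names (`combination_keys`, `build_set_index`, `sum_for_intersection`) and the sample documents are hypothetical. The per-set index only stands in for a view keyed on set name, and the client-side intersection is one possible reading of the "choose the smallest set" suggestion, not a tested CouchDB recipe.

```python
from itertools import combinations


def combination_keys(sets):
    """Return every non-empty combination of a document's sets as sorted tuples."""
    names = sorted(sets)
    return [combo
            for r in range(1, len(names) + 1)
            for combo in combinations(names, r)]


def build_set_index(docs):
    """Map each set name to the ids of the documents in that set.

    Stands in for a view that emits one row per (set, document) pair.
    """
    index = {}
    for doc in docs:
        for name in doc["sets"]:
            index.setdefault(name, set()).add(doc["id"])
    return index


def sum_for_intersection(docs, index, wanted):
    """Sum numeric attributes over documents that belong to every wanted set."""
    by_id = {doc["id"]: doc for doc in docs}
    # Order the per-set id lists by size so the intersection starts from the smallest.
    candidates = sorted((index.get(name, set()) for name in wanted), key=len)
    matching = set.intersection(*candidates) if candidates else set()
    totals = {}
    for doc_id in matching:
        for attr, value in by_id[doc_id]["attr"].items():
            totals[attr] = totals.get(attr, 0) + value
    return totals


# Hypothetical sample data in the shape quoted above.
docs = [
    {"id": 1, "attr": {"x": 2, "y": 4, "z": 6}, "sets": ["A", "B", "C"]},
    {"id": 2, "attr": {"x": 1, "y": 1, "z": 1}, "sets": ["A", "C"]},
    {"id": 3, "attr": {"x": 5, "y": 0, "z": 3}, "sets": ["B"]},
]
index = build_set_index(docs)

print(len(combination_keys(["A", "B", "C"])))         # 7
print(sum_for_intersection(docs, index, ["A", "C"]))  # {'x': 3, 'y': 5, 'z': 7}
```

A document that belongs to k sets produces 2^k - 1 combination keys, which is why emitting every combination stops being practical well before 80 sets. The per-set index grows only with the number of (set, document) pairs, at the cost of doing the intersection and the summing outside the database.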