Date: Fri, 6 Feb 2009 07:55:37 +0000
From: Brian Candler
To: Jeremy Wall
Cc: user@couchdb.apache.org
Subject: Re: Reduce to nothing
Message-ID: <20090206075537.GB15275@uk.tiscali.com>
References:
<7c40ded80902050644y644678cdj6be04ca8dddf1b90@mail.gmail.com>
In-Reply-To: <7c40ded80902050644y644678cdj6be04ca8dddf1b90@mail.gmail.com>

On Thu, Feb 05, 2009 at 08:44:16AM -0600, Jeremy Wall wrote:
> Is it possible to reduce a key to nothing, i.e. completely remove a
> key from the reduction result.
>
> For instance, say you post three documents:
>
> {"_id": "thing1", "type": "thing"}
> {"_id": "thing2", "type": "thing"}
> {"_id": "...", "type": "cancellation", "cancels": "thing1"}
>
> It's trivial to produce a map function that collates the "thing" and
> "cancellation" documents. However, I can't work out how, or even if
> it's possible, to reduce the view so that only "thing2" remains.

Nearest I can think of is to collate your view such that the cancellation
comes immediately after the thing:

  ["thing1","thing"]
  ["thing1","cancellation"]

Then the client can see that these two are adjacent and easily check if the
item has been cancelled. Unfortunately, you can't rely on this in the reduce
function, because sometimes the thing will be in one block of keys/values
and the cancellation will be in another.

If the purpose of your reduce view is only to _count_ how many live things
you have, then you could map:

  ["thing1",1]    # thing
  ["thing1",-1]   # cancellation

and then sum. This won't be right if you can have multiple cancellations for
a thing, but you could avoid this by choosing your doc id naming convention
for cancellations (e.g. "thing1_cancel"). In any case, if you return a
grouped reduce, and see a zero or negative value for a particular key, you
know it has been cancelled.

> I tried not returning anything, just in case it worked ;-), but got a
> JSON encoding error (can't encode undefined, iirc).
You can encode null, though.

> However, I wondered if the more "normal" approach of
> allowing a reduce function to emit zero or more (key, value) pairs
> would be even better?

I think reduce functions have to return a single value - at least, all the
ones I've seen do this.

IIUC, all the k/v pairs pointed to by a single b-tree node are reduced to
one value, which is stored within the same b-tree node. Then the parent
b-tree nodes contain the reduction of their children. The root node contains
the reduction of everything to a single value, and this is what you get if
you query without group=true. If you query with startkey and endkey then the
reduce value is recalculated across the range of keys you specify.

So a reduce function is not a filter on map output, but an aggregation /
summarisation function.

Regards,

Brian.
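
The counting approach above can be sketched as follows. This is a minimal
Python simulation, not CouchDB's JavaScript view API: the function names
(map_doc, reduce_sum, grouped_reduce, live_things) and the sample documents
are hypothetical, and the cancellation _id assumes the "thing1_cancel"
naming convention. It maps things to +1 and cancellations to -1, sums per
key as a grouped reduce would, and leaves the filtering to the client.

```python
from collections import defaultdict

# Hypothetical documents modelled on the example in the question.
# The cancellation _id assumes the "thing1_cancel" naming convention.
DOCS = [
    {"_id": "thing1", "type": "thing"},
    {"_id": "thing2", "type": "thing"},
    {"_id": "thing1_cancel", "type": "cancellation", "cancels": "thing1"},
]


def map_doc(doc):
    """Yield (key, value) rows: +1 for a thing, -1 for a cancellation."""
    if doc["type"] == "thing":
        yield doc["_id"], 1
    elif doc["type"] == "cancellation":
        yield doc["cancels"], -1


def reduce_sum(values):
    """Aggregate a block of values into exactly one value."""
    return sum(values)


def grouped_reduce(docs):
    """Simulate a grouped reduce query: one summed value per key."""
    by_key = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            by_key[key].append(value)
    return {key: reduce_sum(values) for key, values in sorted(by_key.items())}


def live_things(grouped):
    """Client-side filter: keep only keys whose total is still positive."""
    return [key for key, total in grouped.items() if total > 0]


grouped = grouped_reduce(DOCS)
print(grouped)               # {'thing1': 0, 'thing2': 1}
print(live_things(grouped))  # ['thing2']

# Summing partial sums gives the same result as summing everything at once,
# so it does not matter if a thing and its cancellation are reduced in
# different blocks.
all_values = [value for doc in DOCS for _, value in map_doc(doc)]
partial = [reduce_sum(all_values[:1]), reduce_sum(all_values[1:])]
assert reduce_sum(partial) == reduce_sum(all_values)
```

Note that the reduce step still returns a row for "thing1"; removing it is
done by the client, in line with reduce being an aggregation rather than a
filter.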