Date: Thu, 30 Jul 2009 08:43:03 +0100
From: Brian Candler
To: Jochen Kempf
Subject: Re: Problems with reduce in view appear when record size > 6

On Wed, Jul 29, 2009 at 11:48:59PM -0400, Jochen Kempf wrote:
> guessing that you refer to this page [1]incremental map

No, I meant this one.
http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views

"Reduce functions must accept, as input, results emitted by its
corresponding map function *as well as results returned by the reduce
function itself*. The latter case is referred to as a rereduce"

It then goes on to describe the two cases.

> map =>
> "
> function(doc) {
>   emit(doc["_id"], [doc["_id"], doc["_rev"], doc["var1"], doc["var2"],
>     doc["var3"], doc["var4"], doc["var5"]]);
> }
> "
> reduce =>
> "
> function(key, values, combine) {
>   var result = {ids:[], revs:[], variables:[]}
>   if (combine) {
>     for (i in values) {
>       result.ids.push(values[i].ids);
>       result.revs.push(values[i].revs);
>       result.variables.push(values[i].variables);
>     }
>   } else {
>     for (i in values) {
>       result.ids.push(values[i][0]);
>       result.revs.push(values[i][1]);
>       result.variables.push([values[i][2], values[i][3],
>         values[i][4], values[i][5], values[i][6]]);
>     }
>   }
>   return result;
> }
> "

I think you want concat() rather than push() in the combine section.
Otherwise, that looks like a working but extremely bad reduce function.

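
To illustrate the concat() suggestion, here is a minimal sketch of the quoted
reduce function with only the combine (rereduce) branch changed. It runs as
plain JavaScript outside CouchDB; the name reduceFn and the sample rows are
made up for the example and are not from the original thread.

```javascript
// Sketch only: the quoted reduce function, with concat() in the rereduce branch.
// reduceFn and the sample data below are hypothetical, for illustration.
function reduceFn(key, values, combine) {
  var result = {ids: [], revs: [], variables: []};
  var i;
  if (combine) {
    // Rereduce: each value is an earlier result object, so merge its arrays
    // into ours instead of nesting them one level deeper.
    for (i = 0; i < values.length; i++) {
      result.ids = result.ids.concat(values[i].ids);
      result.revs = result.revs.concat(values[i].revs);
      result.variables = result.variables.concat(values[i].variables);
    }
  } else {
    // First pass: each value is the array emitted by the map function.
    for (i = 0; i < values.length; i++) {
      result.ids.push(values[i][0]);
      result.revs.push(values[i][1]);
      result.variables.push(values[i].slice(2, 7));
    }
  }
  return result;
}

// Reduce two batches of made-up map output separately, then rereduce them.
var batchA = reduceFn(null, [["a", "1-x", 1, 2, 3, 4, 5],
                             ["b", "1-y", 6, 7, 8, 9, 10]], false);
var batchB = reduceFn(null, [["c", "1-z", 11, 12, 13, 14, 15]], false);
var combined = reduceFn(null, [batchA, batchB], true);
```

With push() in the combine branch, combined.ids would come out nested as
[["a", "b"], ["c"]]; with concat() it is the flat list of three ids. Either
way the result still holds one entry per document, so it grows with the
database.
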
Once your database goes above a certain size it will trigger a limit error
in CouchDB; you can disable that error, but then you will suffer very poor
performance as your database gets bigger.

The problem is that your reduce value doesn't "reduce" the size of your
output at all; the size of the reduce value will increase linearly with the
size of the database.

Each Btree node stores the reduce value computed across the documents in
that node and its children. This means the root Btree node stores the reduce
value across the entire database. This is very good for calculating reduce
values quickly, but very bad if your reduce value becomes huge, as yours
will, because it will become slower and slower to insert documents. See
"Reduced Value Sizes" in the Wiki page linked to above.

Basically this means you're doing it wrong. This sort of computation should
be done in the client, not the database. If you really want to do it in the
database, do it in a _list function. (This will still end up fetching and
serializing all the documents in the database or the key range in question,
but at least it won't send them over the wire.)

Regards,

Brian.