Date: Thu, 6 Dec 2012 23:19:11 +0000
Subject: Re: growable arrays in reductions
From: Robert Newson
To: "user@couchdb.apache.org"

If reduce_limit didn't bite you, and you have plenty of documents, you're probably fine. It does sound like you're skating near the edge, though.

The reason for the warning is that intermediate reduce values are stored in the b+tree, so if they grow rather than shrink, the b+tree becomes progressively slower (i.e., we start violating the constraints that make b+trees work).

B.

On 6 December 2012 23:08, Will Heger wrote:
> In the end, I could write a list function, but I do so at the cost
> of caching and incremental update. For example, I have a grocery cart
> that is described by a series of transactions, items added, items
> removed. If I wanted to keep a total bill, taxes, item count, in a
> reduction, that would be a pretty canonical reduction along the lines
> of the Event Sourcing design pattern.
>
> My question is whether appending a list of the underlying transaction
> ids would create a problem.
>
> "As a rule of thumb, the data returned by reduce functions should
> remain "smallish" and not grow faster than log(num_rows_processed)."
>
> I'm not totally clear on how to parse this statement. Is the size
> related to size of mapped input documents? Collectively or
> individually measured?
>
> Having the underlying id's would allow me to "close-the-loop" from a
> transactions standpoint. For example, I have ten different clients
> contributing to this one cart. Any particular client can then
> instantly recognize whether her contribution is factored into the
> summary by scanning for her id within the transaction list.
>
> There are other methods for achieving this, but if this is not going
> to cause a problem, it is presently the most elegant for my
> application. But beyond this, I'm just interested in what amount of
> growth is allowable.
>
> "From 0.10 onwards, CouchDB uses a heuristic to detect reduce
> functions that won't scale to give the developer an early warning"
>
> So far Couch has not complained to me about any of the reductions I've
> written, but I still feel like I'm flying a bit blind.
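
To make the size concern in this thread concrete, here is a minimal sketch comparing a cart reduce that returns only totals with one that also appends every transaction id. It is an illustration, not code from either poster: the value shape ({id, amount, qty}), the function names, and the simulate helper are all assumptions, and the helper only imitates reduce followed by rereduce in plain JavaScript rather than running inside CouchDB. In a real view the document id would arrive through the keys argument; the sketch carries it in the value to stay short.

```javascript
// Hypothetical map value for one cart transaction (shape is an assumption):
//   { id: "txn-17", amount: 2, qty: 1 }

// Bounded reduce: the result is a fixed-shape summary, so its serialized
// size barely changes as more rows are folded in.
function reduceTotals(keys, values, rereduce) {
  var out = { total: 0, items: 0 };
  values.forEach(function (v) {
    out.total += rereduce ? v.total : v.amount;
    out.items += rereduce ? v.items : v.qty;
  });
  return out;
}

// Growing reduce: same summary plus every transaction id, so the result
// grows linearly with the number of rows processed.
function reduceWithIds(keys, values, rereduce) {
  var out = { total: 0, items: 0, ids: [] };
  values.forEach(function (v) {
    out.total += rereduce ? v.total : v.amount;
    out.items += rereduce ? v.items : v.qty;
    out.ids = out.ids.concat(rereduce ? v.ids : [v.id]);
  });
  return out;
}

// Hypothetical helper: reduce n synthetic transactions in chunks, then
// rereduce the partial results, roughly as inner b+tree nodes would.
function simulate(reduceFn, n, chunk) {
  var partials = [];
  for (var start = 0; start < n; start += chunk) {
    var values = [];
    for (var i = start; i < Math.min(start + chunk, n); i++) {
      values.push({ id: "txn-" + i, amount: 2, qty: 1 });
    }
    partials.push(reduceFn(null, values, false));
  }
  return reduceFn(null, partials, true);
}

// Size of a reduce value once serialized to JSON.
function jsonSize(value) {
  return JSON.stringify(value).length;
}
```

Comparing jsonSize(simulate(reduceTotals, n, 50)) with jsonSize(simulate(reduceWithIds, n, 50)) for increasing n shows the first staying nearly flat while the second grows with n, which is the kind of growth the rule of thumb quoted above warns about.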