Date: Sat, 4 Jul 2009 20:30:55 -0400
From: Paul Davis
To: dev@couchdb.apache.org
Subject: Re: Possible bug in indexer... (really)
In-Reply-To: <4A4FF0E4.4040804@krampe.se>
References: <4A4E7F53.7010406@krampe.se> <4A4FF0E4.4040804@krampe.se>

2009/7/4 Göran Krampe:
> Adam Kocoloski wrote:
>>
>> Not sure if it's described, but it is by design.  The reduce function
>> executes when the btree is modified.  We can't afford to cache KVs from an
>> index update in memory regardless of size; we have to set some threshold
>> when we flush them to disk.
>
> And I presume you can't write KVs *without* doing the reduce?
>
> When I wrote "described" I am referring to the blog post by Ricky Ho btw. It
> seems to imply a strict ordering, map -> reduce -> rereduce. IIRC.
>

That was probably just the theoretical aspect. Maps always happen first,
obviously, and then when the key/values are inserted into the btree
during a flush the entire tree is built, which means that one or more
reduces are called and then re-reduces are run to fill out the tree. At
the moment we aren't delaying re-reduce calls because it'd require a
major overhaul to the btree code.

>> I think the fundamental question is why the flush operations were
>> occurring so frequently the second time around.  Is it because you were
>> building up a largish hash for the reduce value?  Probably.  Nevertheless,
>> I'd like to have a better handle on that.
>
> Yeah, well, I am on vacation now - but some other guys are not.
> We could of course start by trying to rewrite this the Right Way first as
> Chris said.
>
> I am curious if it can be done using grouping because we dismissed grouping
> due to its relatively slow performance (it runs lots of reduces at query
> time IIRC) :)
>
> Btw, the solution used now DOES return the map for a full year in about 230
> ms, including parsing on client side. So query time was perfectly fine, but
> view generation was not. This shows to me that it *can* work.
>
> regards, Göran
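
For anyone following along, the flush behaviour described in the reply can be illustrated with a toy Python sketch. This is not CouchDB's btree code (which is written in Erlang); the names `reduce_sum` and `build_tree_reductions` and the `fanout` parameter are invented for illustration, and the tree is modelled only as lists of per-node reduction values. It assumes a reduce function with the usual `(keys, values, rereduce)` shape and shows that reduces run over the leaf KVs and re-reduces run immediately afterwards to fill in the inner nodes, all within the same flush.

```python
from typing import Callable, List, Optional, Tuple

KV = Tuple[str, int]


def reduce_sum(keys: Optional[List[str]], values: List[int], rereduce: bool) -> int:
    # Summing works identically for raw values and for earlier
    # reduce outputs, so the rereduce flag is not needed here.
    return sum(values)


def build_tree_reductions(
    kvs: List[KV], reduce_fn: Callable, fanout: int = 2
) -> Tuple[Optional[int], List[str]]:
    """Toy model of one flush (hypothetical, not CouchDB's algorithm).

    Sorted KVs are chunked into leaves, each leaf is reduced, and the
    leaf reductions are re-reduced level by level up to a single root.
    Returns the root reduction and a log of the calls that were made.
    """
    calls: List[str] = []
    kvs = sorted(kvs)

    # Leaf level: one reduce call per leaf node.
    level = []
    for i in range(0, len(kvs), fanout):
        chunk = kvs[i:i + fanout]
        keys = [k for k, _ in chunk]
        values = [v for _, v in chunk]
        level.append(reduce_fn(keys, values, False))
        calls.append("reduce")

    # Inner levels: rereduce the child reductions until one value remains.
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            parents.append(reduce_fn(None, level[i:i + fanout], True))
            calls.append("rereduce")
        level = parents

    return (level[0] if level else None), calls


if __name__ == "__main__":
    rows = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)]
    root, calls = build_tree_reductions(rows, reduce_sum, fanout=2)
    print(root, calls)
```

In this model every flush pays for both the reduce and the re-reduce calls, so flushing more often (for example because the reduce values are large) means the reduce function runs more often during view generation.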