Message-ID: <30d0cf2c0811221256v4a4349f5w639c69a2c159f8e3@mail.gmail.com>
Date: Sat, 22 Nov 2008 15:56:53 -0500
From: "Nuno Job"
To: couchdb-user@incubator.apache.org
Subject: Re: Map/Reduce takes lots of time every request
References: <49286ACA.4020606@gmail.com>

Slightly off-topic: did anyone see Simon Peyton Jones talking about making
something generic and MapReduce-like in Haskell?

http://tinyurl.com/537apv

On Sat, Nov 22, 2008 at 3:53 PM, Chris Anderson wrote:

> On Sat, Nov 22, 2008 at 12:25 PM, maddiin wrote:
> >
> > Do you have any advice what I am doing wrong and how I could speed this up?
>
> I'm curious how long it takes with reduce=false (should be limited
> basically by IO).
>
> I'm almost certain (please correct me if I'm wrong) that reduce
> requests must call the JavaScript interpreter at least once per
> request, to rereduce the btree inner-nodes that fit in that request
> range. This means for group=true requests, the rereduce function must
> run once per unique key (at minimum). That would be the source of your
> slowness. It sounds like you are building a tag cloud. The smart money
> would be on caching the results of that operation, which is standard
> practice with SQL-based tag clouds as well.
>
> If you're not doing a tag cloud, maybe there's a way you can get the
> needed results using map only?
>
> Also, I'm not sure, but perhaps it would be possible for CouchDB to
> cache final reduce values in the btree as well, so that group=true
> queries can save the cost of the final rereduce (and make subsequent
> queries fast...)
>
> Chris
>
> --
> Chris Anderson
> http://jchris.mfdz.com
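
For anyone who wants to try the caching suggestion quoted above, here is a minimal sketch in Python. It is not CouchDB API: CachedResult, count_tags, the sample documents and the 60-second lifetime are all hypothetical names and values chosen for illustration. count_tags only stands in for whatever client call actually runs the group=true query, so the sketch runs without a server.

```python
import time
from collections import Counter


def count_tags(docs):
    """Stand-in for the expensive grouped reduce: counts tags per unique key.

    Assumption: each document is a dict with a "tags" list. In a real
    application this would be the group=true view request instead.
    """
    counts = Counter()
    for doc in docs:
        counts.update(doc.get("tags", []))
    return dict(counts)


class CachedResult:
    """Keeps the last result of `fetch` and reuses it for `ttl_seconds`."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # zero-argument callable doing the slow work
        self._ttl = ttl_seconds      # how long a cached value stays valid
        self._clock = clock          # injectable so the expiry can be tested
        self._value = None
        self._expires_at = None      # None means nothing has been cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache is empty or stale: pay the cost of the slow call once.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


if __name__ == "__main__":
    docs = [{"tags": ["couchdb", "erlang"]}, {"tags": ["couchdb"]}]
    tag_cloud = CachedResult(lambda: count_tags(docs), ttl_seconds=60)
    print(tag_cloud.get())  # first call computes the counts
    print(tag_cloud.get())  # second call within 60 s reuses the cached value
```

The trade-off is staleness: a tag cloud served this way can lag behind new documents by up to the chosen lifetime, which is usually acceptable for that kind of display.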