Date: Mon, 27 Oct 2008 09:28:26 -0400
From: "Paul Davis"
To: couchdb-user@incubator.apache.org
Subject: Re: Efficient view design question
In-Reply-To: <4905BD43.5020509@tangentlabs.co.uk>
References: <1ac9e0120810261220h1528d201ta638c181b654829@mail.gmail.com> <0CFA983F-A3EA-43F5-9B80-76F818F546A4@apache.org> <4905A415.2090208@tangentlabs.co.uk> <4905BD43.5020509@tangentlabs.co.uk>

Jonathan,

That's there too. Same patch even. You can post an array of keys to
any defined or temporary view as well as _all_docs. Not sure if it's in
the wiki yet or not.

Note: The post body should include something like {"keys": ["key1", "key2"]}

And if you're hitting _all_docs, key1... would be document ids.

Paul

On Mon, Oct 27, 2008 at 9:08 AM, Jonathan Moss wrote:
> Paul,
>
> That makes sense :)
>
> As for using the include_docs parameter, that is certainly one option. I also
> believe I saw something mentioned a while ago about being able to retrieve
> multiple docs from a single GET request by providing a series of Ids. Was
> this just in discussion, or does it already exist? I figure if I already
> have the Ids then I do not need to use a view for this.
>
> Thanks,
>
> Jon
>>
>> Jonathan,
>>
>> First off, to allay your main concern, view indexes are not completely
>> regenerated on each update. It's only a diff.
>>
>> So, say we have a database with a view that has already been built.
>> If a document X changes in the db, the view server deletes any rows
>> in the view that came from doc X, then runs the map function on the
>> new version of the doc and adds back whatever rows it emits.
>>
>> This way, each time you request a view, it only updates the rows for
>> documents that have changed since the last view request.
>>
>> Other than that, as you point out, emitting the entire doc isn't
>> overly efficient. One thing to consider is the relatively recent
>> addition of the include_docs parameter. Also, there's a wiki page on
>> working with hierarchical data that's got some good ideas.
>>
>> HTH,
>> Paul Davis
>>
>> On Mon, Oct 27, 2008 at 7:20 AM, Jonathan Moss wrote:
>>
>>>
>>> Greetings all,
>>>
>>> I am currently writing a set of classes to handle PHP object model <->
>>> CouchDB. The PHP objects are hierarchical and I have modelled this as
>>> essentially a doubly linked list, so that every document within CouchDB
>>> has a 'Children' array and a 'Parents' array. These arrays contain the
>>> Ids of related objects.
>>>
>>> I already have a couple of map functions to retrieve children and
>>> parents:
>>>
>>> "childrenOf": {
>>>   "map": "function(doc) {for(var idx in doc.Parents) {emit(doc.Parents[idx], doc);}}"
>>> },
>>> "parentsOf": {
>>>   "map": "function(doc) {for(var idx in doc.Children) {emit(doc.Children[idx], doc);}}"
>>> }
>>>
>>> These functions return whole documents. My understanding of views is that
>>> these views would have to be re-generated every time a document is added,
>>> removed or updated. If this is the case, then when the number of documents
>>> in the database starts getting larger, the initial response time to
>>> retrieve one of these views would become considerable. In a small system
>>> where writes are uncommon and reads are regular, this would not be an issue.
>>> However, I am struggling to find more than a handful of niche applications
>>> where this would be true. In almost all web applications I have written,
>>> almost every request to the website will result in something (even if it
>>> is just tracking data) being written to the database. On a high volume
>>> website this would result in views having to be re-created almost
>>> constantly. Therefore efficient view design becomes paramount.
>>>
>>> The view functions shown above return the whole doc, which I know is
>>> inefficient. In fact, since I already have the document I want the
>>> children/parents of, I also already have all the child/parent IDs. Would
>>> it be much more efficient to simply retrieve the parent/child documents
>>> individually rather than having to re-generate views all the time?
>>>
>>> As a side question - Having to re-generate views constantly in this kind
>>> of situation could prove a real issue. I know that CouchDB is still a
>>> pre-1.0 release and the developers are necessarily focusing on 'getting
>>> it right' before 'getting it fast' (to coin a phrase :) but will
>>> improvements in speed already on the roadmap make these worries moot
>>> except in very large databases, or is it always going to be an issue and
>>> therefore require some clever application design?
>>> e.g. keeping frequently updated data in a traditional SQL DB and only
>>> keeping rarely updated data in CouchDB, which would be a shame.
>>>
>>> Thanks,
>>> Jon
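
For illustration, here is a rough sketch of the two suggestions made in this thread: have the map function emit null instead of the whole doc, and fetch related documents in one request by POSTing {"keys": [...]} to _all_docs. It is not code from either poster. The emit stand-in, the rows array, the bulkFetchRequest helper and the database name "mydb" are all made up so the snippet runs outside the view server, and combining include_docs=true with the keys POST is an assumption about the CouchDB version in use. Nothing is actually sent over HTTP.

```javascript
// Stand-in for the view server's emit(), so the map function can run locally.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Leaner variant of the "childrenOf" map from the thread:
// one row per parent id, with null as the value instead of the whole doc.
function childrenOf(doc) {
  var parents = doc.Parents || [];
  for (var i = 0; i < parents.length; i++) {
    emit(parents[i], null);
  }
}

// Hypothetical helper: describes the bulk request without sending it.
// The body follows the {"keys": [...]} shape given in the reply above.
function bulkFetchRequest(dbName, ids) {
  return {
    method: "POST",
    path: "/" + encodeURIComponent(dbName) + "/_all_docs?include_docs=true",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ keys: ids }),
  };
}

// Example document shaped like the ones described in the original question.
const doc = { _id: "c1", Parents: ["p1", "p2"], Children: [] };

childrenOf(doc);
const request = bulkFetchRequest("mydb", doc.Parents);

console.log(rows);
console.log(request.method, request.path, request.body);
```

With null values the view index only stores keys and document ids, and the documents themselves are pulled in at query time. When the ids are already known, as in the original question, the keys POST to _all_docs skips the view entirely.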