couchdb-user mailing list archives

From Jonathan Moss <>
Subject Re: Efficient view design question
Date Mon, 27 Oct 2008 13:08:19 GMT

That makes sense :)

As for using the include_docs parameter, that is certainly one option. I
also believe I saw something mentioned a while ago about being able to
retrieve multiple docs from a single GET request by providing a series
of Ids. Was this just in discussion, or does it already exist? I figure
that if I already have the Ids then I do not need to use a view for this.
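
For reference, a rough sketch of the kind of multi-doc request I have in mind. It assumes a bulk-fetch endpoint that takes the Ids as a `keys` array in a POST body on `_all_docs`, together with `include_docs=true`, which is how later CouchDB releases expose it; whether the version in use supports this is exactly my question. The helper `buildBulkFetch` and the database name `mydb` are made up for illustration, and the request is only built, not sent.

```javascript
// Builds (but does not send) a bulk-fetch request for a list of document Ids.
// Assumption: POST /{db}/_all_docs accepts {"keys": [...]} and include_docs=true.
function buildBulkFetch(dbName, ids) {
  return {
    method: "POST",
    path: "/" + encodeURIComponent(dbName) + "/_all_docs?include_docs=true",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ keys: ids }),
  };
}

// A document I already hold, so its child Ids are already known.
const doc = { _id: "b", Parents: ["a"], Children: ["c", "d"] };
const request = buildBulkFetch("mydb", doc.Children);
console.log(request.method, request.path, request.body);
```

The point is that no view is involved at all: the Ids come straight from the `Children` (or `Parents`) array of the document already in hand.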


> Jonathan,
> First off, to allay your main concern, view indexes are not completely
> regenerated on each update. It's only a diff.
> So, say we have a database with some built view. If a document X
> changes in the db, the view server deletes any rows in the view that
> came from doc X, then runs the map function with the new version of the
> doc, adding back any of the rows.
> In this way, each time you request a view, it's only updating the
> data that's changed since the last view request.
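
To check my understanding of the diff-style update described above, here is a minimal in-memory sketch of it. The `ViewIndex` class and its method names are invented for illustration and say nothing about how CouchDB actually stores its indexes; the only thing it models is "drop the rows from the changed doc, re-run map, add the new rows back".

```javascript
// Toy model of an incrementally maintained view (not CouchDB internals).
class ViewIndex {
  constructor(mapFn) {
    this.mapFn = mapFn;          // function (doc, emit)
    this.rowsByDoc = new Map();  // doc _id -> rows emitted by that doc
  }

  // Called once per changed document; untouched documents are never re-mapped.
  update(doc) {
    this.rowsByDoc.delete(doc._id);  // delete rows that came from the old version
    if (doc._deleted) return;        // a deleted doc contributes no rows
    const rows = [];
    this.mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
    this.rowsByDoc.set(doc._id, rows);  // add back rows from the new version
  }

  // Return every row whose key matches.
  query(key) {
    const out = [];
    for (const rows of this.rowsByDoc.values()) {
      for (const row of rows) {
        if (row.key === key) out.push(row);
      }
    }
    return out;
  }
}

// Same idea as the childrenOf view: one row per parent Id.
const childrenOf = (doc, emit) => {
  for (const parentId of doc.Parents || []) emit(parentId, doc._id);
};

const index = new ViewIndex(childrenOf);
index.update({ _id: "a", Parents: ["root"] });
index.update({ _id: "b", Parents: ["root"] });
index.update({ _id: "b", Parents: ["a"] });  // only b's rows are replaced
console.log(index.query("root"), index.query("a"));
```

After the third call, only the rows that came from "b" have been touched; the row emitted by "a" is left alone.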
> Other than that, as you point out, emitting the entire doc isn't
> overly efficient. One thing to consider is the relatively recent addition
> of the include_docs parameter. Also, there's a wiki page on working
> with hierarchical data that's got some good ideas.
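
As I read the include_docs suggestion, the map function would emit only the key and the query would ask for the document bodies. A sketch under that assumption: the `emit` defined here is just a stand-in for the one the view server provides, so the function can be run on its own, and the query string is illustrative rather than a complete view URL.

```javascript
// Stand-in for the view server's emit(), so the map function runs standalone.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Leaner childrenOf: emit the parent Id as the key and null as the value,
// instead of copying the whole document into the index.
function childrenOf(doc) {
  for (var idx in doc.Parents) {
    emit(doc.Parents[idx], null);
  }
}

childrenOf({ _id: "b", Parents: ["a", "root"] });

// View keys are JSON, so a string key is JSON-encoded before URL-encoding.
// include_docs=true asks for each row's document to be returned with the row.
const query = "?key=" + encodeURIComponent(JSON.stringify("a")) + "&include_docs=true";
console.log(rows, query);
```

The index then holds only keys and document Ids, and the documents are looked up at query time.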
> HTH,
> Paul Davis
> On Mon, Oct 27, 2008 at 7:20 AM, Jonathan Moss
> <> wrote:
>> Greetings all,
>> I am currently writing a set of classes to handle PHP object model <->
>> CouchDB. The PHP objects are hierarchical and I have modelled this as
>> essentially a doubly linked list, so that every document within CouchDB has
>> a 'Children' array and a 'Parents' array. These arrays contain the Ids of
>> related objects.
>> I already have a couple of map functions to retrieve children and parents:
>> "childrenOf": {
>>      "map": "function(doc) {for(var idx in doc.Parents) {emit(doc.Parents[idx], doc);}}"
>>  },
>>  "parentsOf": {
>>      "map": "function(doc) {for(var idx in doc.Children) {emit(doc.Children[idx], doc);}}"
>>  }
>> These functions return whole documents. My understanding of views is that
>> these views would have to be re-generated every time a document is added,
>> removed or updated. If this is the case then when the number of documents in
>> the database starts getting larger, the initial response time to retrieve
>> one of these views would become considerable. In a small system where
>> writes are uncommon and reads regular, this would not be an issue. However,
>> I am struggling to find more than a handful of niche applications where this
>> would be true. In almost all web applications I have written, almost every
>> request to the website will result in something (even if it is just tracking
>> data) being written to the database. On a high volume website this would
>> result in views having to be re-created almost constantly. Therefore
>> efficient view design becomes paramount.
>> The view functions shown above return the whole doc, which I know is
>> inefficient. In fact, since I already have the document I want the
>> children/parents of, I also already have all the child/parent Ids. Would it
>> be much more efficient to simply retrieve the parent/child documents
>> individually rather than having to re-generate views all the time?
>> As a side question: having to re-generate views constantly in this kind of
>> situation could prove a real issue. I know that CouchDB is still pre-1.0
>> and the developers are necessarily focusing on 'getting it right'
>> before 'getting it fast' (to coin a phrase :) but will improvements in speed
>> already on the roadmap make these worries moot except in very large
>> databases, or is it always going to be an issue and therefore require some
>> clever application design?
>> e.g. keeping frequently updated data in a traditional SQL DB and only keeping
>> rarely updated data in CouchDB, which would be a shame.
>> Thanks,
>> Jon
