incubator-couchdb-user mailing list archives

From "Paul Davis" <paul.joseph.da...@gmail.com>
Subject Re: Efficient view design question
Date Mon, 27 Oct 2008 12:59:29 GMT
Jonathan,

First off, to allay your main concern: view indexes are not completely
regenerated on each update. Only the diff is applied.

So, suppose we have a database with a view that has already been built.
If a document X changes in the db, the view server deletes any rows in
the view that came from doc X, then runs the map function on the new
version of the doc and adds back any rows it emits.

This way, each time you request a view, it's only updating the rows for
documents that have changed since the last view request.
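
The update behaviour described above can be sketched roughly as follows.
This is only a toy model to illustrate the idea, not CouchDB's actual
implementation: the ViewIndex class and its update/remove/query methods
are hypothetical names, and the lookup is a plain scan rather than a
real index structure.

```javascript
// Toy model of an incrementally updated view index.
// ViewIndex, update, remove and query are hypothetical names.
class ViewIndex {
  constructor(mapFn) {
    this.mapFn = mapFn;          // called as mapFn(doc, emit)
    this.rowsByDoc = new Map();  // doc id -> rows emitted by that doc
  }

  // Called only for documents that changed since the last view request.
  update(doc) {
    this.rowsByDoc.delete(doc._id); // drop rows from the old revision
    const rows = [];
    this.mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
    if (rows.length > 0) {
      this.rowsByDoc.set(doc._id, rows); // add back rows from the new revision
    }
  }

  // Called when a document is deleted.
  remove(docId) {
    this.rowsByDoc.delete(docId);
  }

  // Linear scan for simplicity; rows for untouched docs are reused as-is.
  query(key) {
    const out = [];
    for (const rows of this.rowsByDoc.values()) {
      for (const row of rows) {
        if (row.key === key) out.push(row);
      }
    }
    return out;
  }
}

// Same idea as the childrenOf view, emitting one row per parent ID.
const childrenOf = (doc, emit) => {
  for (const parentId of doc.Parents || []) emit(parentId, null);
};

const index = new ViewIndex(childrenOf);
index.update({ _id: "a", Parents: ["root"] });
index.update({ _id: "b", Parents: ["root"] });

// Document "b" moves under "a": only b's rows are recomputed.
index.update({ _id: "b", Parents: ["a"] });
```

After the last call, the rows emitted by doc "a" are untouched; only the
rows that came from doc "b" were thrown away and rebuilt.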

Other than that, as you point out, emitting the entire doc isn't
overly efficient. One thing to consider is the relatively recent
addition of the include_docs parameter. Also, there's a wiki page on
working with hierarchical data that's got some good ideas.
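
As a rough sketch of the include_docs approach, the views could emit
null instead of the doc and leave it to the query to pull in the
documents. The design document name "hierarchy", the viewUrl helper and
the database name are made up for illustration, and the URL layout is
the one used by current CouchDB releases; the path in the release you
are running may differ.

```javascript
// Views that emit only the key; the value is null, so the index stays small.
// "_design/hierarchy" is a hypothetical design document name.
const designDoc = {
  _id: "_design/hierarchy",
  views: {
    childrenOf: {
      map: "function(doc) { for (var idx in doc.Parents) { emit(doc.Parents[idx], null); } }"
    },
    parentsOf: {
      map: "function(doc) { for (var idx in doc.Children) { emit(doc.Children[idx], null); } }"
    }
  }
};

// Hypothetical helper: builds a view query path that asks CouchDB to
// attach the full document to each row via include_docs=true.
// Assumes the /{db}/_design/{ddoc}/_view/{view} layout.
function viewUrl(db, design, view, key) {
  const params = new URLSearchParams({
    key: JSON.stringify(key), // view keys are passed JSON-encoded
    include_docs: "true"
  });
  return `/${db}/_design/${design}/_view/${view}?${params}`;
}

console.log(viewUrl("mydb", "hierarchy", "childrenOf", "some-parent-id"));
```

With a query like this, each returned row should carry the matching
document alongside the key, so the view itself never has to store it.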

HTH,
Paul Davis

On Mon, Oct 27, 2008 at 7:20 AM, Jonathan Moss
<jonathan.moss@tangentlabs.co.uk> wrote:
> Greetings all,
>
> I am currently writing a set of classes to handle PHP object model <->
> CouchDB. The PHP objects are hierarchical and I have modelled this as
> essentially a doubly linked list, so that every document within CouchDB has
> a 'Children' array and a 'Parents' array. These arrays contain the IDs of
> related objects.
>
> I already have a couple of map functions to retrieve children and parents:
>
> "childrenOf": {
>      "map": "function(doc) {for(var idx in doc.Parents) {emit(doc.Parents[idx], doc);}}"
>  },
>  "parentsOf": {
>      "map": "function(doc) {for(var idx in doc.Children) {emit(doc.Children[idx], doc);}}"
>  }
>
> These functions return whole documents. My understanding of views is that
> these views would have to be regenerated every time a document is added,
> removed or updated. If this is the case, then as the number of documents in
> the database grows, the initial response time to retrieve
> one of these views would become considerable. In a small system where
> writes are uncommon and reads are regular, this would not be an issue. However,
> I am struggling to find more than a handful of niche applications where this
> would be true. In almost all web applications I have written, almost every
> request to the website results in something (even if it is just tracking
> data) being written to the database. On a high-volume website this would
> result in views having to be recreated almost constantly. Therefore
> efficient view design becomes paramount.
>
> The view functions shown above return the whole doc, which I know is
> inefficient. In fact, since I already have the document I want the
> children/parents of, I also already have all the child/parent IDs. Would it
> be much more efficient to simply retrieve the parent/child documents
> individually rather than having to regenerate views all the time?
>
> As a side question: having to regenerate views constantly in this kind of
> situation could prove a real issue. I know that CouchDB is still pre-1.0
> and the developers are necessarily focusing on 'getting it right'
> before 'getting it fast' (to coin a phrase :) but will the speed improvements
> already on the roadmap make these worries moot except in very large
> databases, or is it always going to be an issue and therefore require some
> clever application design?
> e.g. keeping frequently updated data in a traditional SQL DB and only keeping
> rarely updated data in CouchDB, which would be a shame.
>
> Thanks,
> Jon
>
