couchdb-user mailing list archives

From "Paul Davis" <paul.joseph.da...@gmail.com>
Subject Re: Efficient view design question
Date Mon, 27 Oct 2008 13:28:26 GMT
Jonathan,

That's there too. Same patch, even. You can POST an array of keys to
any defined or temporary view, as well as to _all_docs. Not sure if it's
in the wiki yet or not.

Note: The POST body should include something like

{"keys": ["key1", "key2"]}

And if you're hitting _all_docs, key1... would be document ids.
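
For illustration only, here is a rough Python sketch of what such a
request could look like (the database name "mydb", the ids, and the
local port 5984 are just placeholders, not taken from this thread):

import json
import urllib.request

# POST a {"keys": [...]} body to _all_docs; with _all_docs each key is
# a document id, and include_docs=true also returns the docs themselves.
url = "http://127.0.0.1:5984/mydb/_all_docs?include_docs=true"
body = json.dumps({"keys": ["key1", "key2"]}).encode("utf-8")
req = urllib.request.Request(
    url, data=body, headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    for row in json.load(resp)["rows"]:
        print(row.get("id"), row.get("doc"))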

Paul

On Mon, Oct 27, 2008 at 9:08 AM, Jonathan Moss
<jonathan.moss@tangentlabs.co.uk> wrote:
> Paul,
>
> That makes sense :)
>
> As for using the include_docs parameter, that is certainly one option. I
> also believe I saw something mentioned a while ago about being able to
> retrieve multiple docs from a single GET request by providing a series of
> Ids. Was this just in discussion, or does it already exist? I figure that
> if I already have the Ids then I do not need to use a view for this.
>
> Thanks,
>
> Jon
>>
>> Jonathan,
>>
>> First off, to allay your main concern, view indexes are not completely
>> regenerated on each update. It's only a diff.
>>
>> So, say we have a database with a built view. If a document X
>> changes in the db, the view server deletes any rows in the view that
>> came from doc X, then runs the map function against the new version of
>> the doc, adding back any rows it emits.
>>
>> This way, each time you request a view, it only updates the
>> data that has changed since the last view request.
>>
>> Other than that, as you point out, emitting the entire doc isn't
>> overly efficient. One thing to consider is the relatively recent
>> addition of the include_docs parameter. Also, there's a wiki page on
>> working with hierarchical data that has some good ideas.
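>>
>> For instance, a rough Python sketch of querying your childrenOf view
>> with include_docs=true (the database name "mydb", the design doc name
>> "hierarchy", the parent id, and the port are all just placeholders);
>> the map could then emit null instead of the whole doc:
>>
>> import json
>> import urllib.parse
>> import urllib.request
>>
>> # Fetch the childrenOf rows for one parent id. include_docs=true tells
>> # CouchDB to attach the document that emitted each row, so the map
>> # function no longer needs to emit the whole doc as its value.
>> params = urllib.parse.urlencode({
>>     "key": json.dumps("some_parent_id"),
>>     "include_docs": "true",
>> })
>> url = ("http://127.0.0.1:5984/mydb/_design/hierarchy"
>>        "/_view/childrenOf?" + params)
>> with urllib.request.urlopen(url) as resp:
>>     for row in json.load(resp)["rows"]:
>>         print(row["id"], row["doc"])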
>>
>> HTH,
>> Paul Davis
>>
>> On Mon, Oct 27, 2008 at 7:20 AM, Jonathan Moss
>> <jonathan.moss@tangentlabs.co.uk> wrote:
>>
>>>
>>> Greetings all,
>>>
>>> I am currently writing a set of classes to handle a PHP object model <->
>>> CouchDB mapping. The PHP objects are hierarchical, and I have modelled
>>> this as essentially a doubly linked list, so that every document within
>>> CouchDB has a 'Children' array and a 'Parents' array. These arrays
>>> contain the Ids of related objects.
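>>>
>>> So a document might look something like this (the ids here are just
>>> made-up examples):
>>>
>>> {
>>>   "_id": "page_about",
>>>   "Parents": ["page_home"],
>>>   "Children": ["page_team", "page_history"]
>>> }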
>>>
>>> I already have a couple of map functions to retrieve children and
>>> parents:
>>>
>>> "childrenOf": {
>>>     "map": "function(doc) {for(var idx in doc.Parents)
>>> {emit(doc.Parents[idx], doc);}}"
>>>  },
>>>  "parentsOf": {
>>>     "map": "function(doc) {for(var idx in doc.Children)
>>> {emit(doc.Children[idx], doc);}}"
>>>  }
>>>
>>> These functions return whole documents. My understanding of views is that
>>> these views would have to be re-generated every time a document is added,
>>> removed or updated. If this is the case, then as the number of documents
>>> in the database grows, the initial response time to retrieve one of these
>>> views would become considerable. In a small system where writes are
>>> uncommon and reads are frequent, this would not be an issue. However, I
>>> am struggling to find more than a handful of niche applications where
>>> this would be true. In almost every web application I have written,
>>> almost every request to the website results in something (even if it is
>>> just tracking data) being written to the database. On a high-volume
>>> website this would result in views having to be re-created almost
>>> constantly. Therefore, efficient view design becomes paramount.
>>>
>>> The view functions shown above return the whole doc, which I know is
>>> inefficient. In fact, since I already have the document I want the
>>> children/parents of, I also already have all the child/parent IDs. Would
>>> it be much more efficient to simply retrieve the parent/child documents
>>> individually rather than having to re-generate views all the time?
>>>
>>> As a side question - having to re-generate views constantly in this kind
>>> of a situation could prove a real issue. I know that CouchDB is still a
>>> pre-1.0 release and the developers are necessarily focusing on 'getting
>>> it right' before 'getting it fast' (to coin a phrase :) but will
>>> improvements in speed already on the roadmap make these worries moot
>>> except in very large databases, or is it always going to be an issue and
>>> therefore require some clever application design?
>>> e.g. keeping frequently updated data in a traditional SQL DB and only
>>> keeping rarely updated data in CouchDB, which would be a shame.
>>>
>>> Thanks,
>>> Jon
>>>
>>>
>>
>>
>>
>
>
