couchdb-user mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: getting most recent doc
Date Mon, 19 Apr 2010 13:41:29 GMT
On Apr 17, 2010, at 11:09 AM, Eric Casteleijn wrote:

> On 04/16/2010 04:46 AM, wolfgang haefelinger wrote:
>> Thanks Robert
>> 
>> for your answer. However, it is not exactly what I was looking for
>> (due to my inappropriate problem description).
>> 
>> Firstly, I do want to have the document instead of the time stamp in
>> order to avoid that additional document fetch. That's obviously easy
>> to fix:
>> 
>> function(doc) {
>>  emit([doc.name, doc.timestamp], doc);
>> }
> 
> Don't do that. It's unnecessary, because you can always call any view with
> '?include_docs=true' and it will add a 'doc' member to each row, containing
> the document. Worse than that, it's harmful, as it makes the indexes stored
> on disk many times larger than they need to be. Depending on the size of
> your documents this can make a huge difference. Anecdotal evidence: gwibber
> used to do this, and when I changed it, the indexes stored on disk decreased
> some 90% in size.
> 
> If you always want the whole document, just emit null for a value and always
> call the view with include_docs.
> 
> If there are cases where you don't want the whole document, decide which
> data you need and only emit that.

Hi Eric, I don't think it's correct to make a blanket recommendation to always
use include_docs=true. For large range queries on a view, query performance
will be much better if the doc is already included in the view: up to 10x
better throughput on large DBs in my experience. Yes, the view index will
balloon in size, but some people may be willing to make that tradeoff. Cheers,

Adam
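
To make the tradeoff concrete, here is a minimal sketch of the two map
variants discussed above, run against a tiny stand-in for the view server.
The runMap helper, the sample documents, and passing emit as a second
argument are assumptions for illustration only. In CouchDB itself, emit is
a global and the map function takes just doc. The serialized row size is
only a rough proxy for what ends up in the on-disk index.

```javascript
// Stand-in for CouchDB's view server. In CouchDB, emit() is a global supplied
// by the view server; it is passed in explicitly here so the sketch runs alone.
function runMap(map, docs) {
  const rows = [];
  for (const doc of docs) {
    map(doc, (key, value) => rows.push({ id: doc._id, key: key, value: value }));
  }
  return rows;
}

// Variant 1: the whole document is stored in the view index as the value.
function mapWithDoc(doc, emit) {
  emit([doc.name, doc.timestamp], doc);
}

// Variant 2: only the key is stored; the document is fetched at query time
// by calling the view with ?include_docs=true.
function mapWithNull(doc, emit) {
  emit([doc.name, doc.timestamp], null);
}

// Hypothetical sample documents.
const docs = [
  { _id: "a1", name: "report", timestamp: 1271497740, body: "x".repeat(500) },
  { _id: "a2", name: "report", timestamp: 1271684489, body: "y".repeat(500) },
];

const rowsWithDoc = runMap(mapWithDoc, docs);
const rowsWithNull = runMap(mapWithNull, docs);

// Serialized length as a rough proxy for how much each variant stores.
const sizeWithDoc = JSON.stringify(rowsWithDoc).length;
const sizeWithNull = JSON.stringify(rowsWithNull).length;

console.log("rows with doc as value:", sizeWithDoc, "characters");
console.log("rows with null value:  ", sizeWithNull, "characters");
```

The first variant pays for the document body in every index row but can
answer a range query from the index alone. The second keeps the index
small and defers the document fetch to query time.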

> 
> -- 
> eric casteleijn
> https://code.launchpad.net/~thisfred
> Canonical Ltd.
> 

