couchdb-user mailing list archives

From Jason Smith <...@proven-corporation.com>
Subject Re: Suggestions on View performance optimization/improvement
Date Wed, 01 Apr 2009 17:31:17 GMT
I'd be very interested to know the performance impact of that 
optimization as well.  What is the overhead or bottleneck with large 
view values?  Estimating 100 bytes per key/value pair, and about 20 pairs 
in each of the million documents, that's 2GB of raw data, which should 
write to a laptop disk within 2 minutes.
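
To make that estimate concrete, here is a rough sketch of the arithmetic. Every figure is an assumption taken from this thread (a million documents, about 20 fields each, a guessed 100 bytes per field, a 2 minute target), not a measurement:

```javascript
// Back-of-the-envelope estimate; all inputs are assumptions from the thread.
const docs = 1e6;            // one million documents
const fieldsPerDoc = 20;     // "about 20 fields" per document
const bytesPerField = 100;   // guessed size of one key/value pair

// Raw data volume if every document is emitted whole as the view value.
const totalBytes = docs * fieldsPerDoc * bytesPerField;

// Sustained write rate needed to finish within the 2 minute target.
const seconds = 120;
const mbPerSecond = totalBytes / seconds / 1e6;

console.log(totalBytes / 1e9, "GB of raw data");
console.log(mbPerSecond.toFixed(1), "MB/s sustained write rate needed");
```

The required rate is modest for a laptop disk, so raw write volume alone would not seem to explain a 90 minute build.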

I'm wondering whether it matters how large the view values are, since 
they would seem not to be involved in the view processing very 
much--only written to disk in the order defined by the keys.

Of course, that goes against the common wisdom that the fastest thing to 
do is emit(key, null); but that could impact the application 
significantly since you have to query again for the documents.  (I'm 
unsure whether include_docs has a performance penalty either.)
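
For reference, the two variants being compared look roughly like this. This is only a sketch: in CouchDB the view server supplies emit(), so the stub below, the sample document, and the helper names (mapWithDoc, mapWithNull, emittedLength) are all made up for illustration, and serialized JSON length is only a rough proxy for what ends up in the index:

```javascript
// Stub for emit(); CouchDB's view server provides the real one.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Variant A: the whole document is copied into the view as the value.
function mapWithDoc(doc) {
  emit(doc.field, doc);
}

// Variant B: only the key is stored; the document is fetched at query time.
function mapWithNull(doc) {
  emit(doc.field, null);
}

// Hypothetical sample document, not taken from the original data set.
const sample = { _id: "doc-1", field: "abc", payload: "x".repeat(2000) };

// Length of the serialized row emitted by a map function for one document.
function emittedLength(mapFn, doc) {
  rows.length = 0;            // reset the stub's output
  mapFn(doc);
  return JSON.stringify(rows[0]).length;
}

const sizeWithDoc = emittedLength(mapWithDoc, sample);
const sizeWithNull = emittedLength(mapWithNull, sample);
console.log("emit(key, doc): ", sizeWithDoc, "chars per row");
console.log("emit(key, null):", sizeWithNull, "chars per row");
```

With variant B the application would add include_docs=true to the view query to get the documents back alongside the keys.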

I guess what I'm asking is, why does the value side of views impact 
performance so greatly?

kowsik wrote:
> I would highly recommend that you do emit(doc.field, null) so that the
> key space doesn't get unwieldy and large. Since the id of the document
> is part of the map results, you can always fetch it using
> include_docs=true.
> 
> K.
> 
> On Wed, Apr 1, 2009 at 10:12 AM, Manjunath Somashekhar
> <manjunath_somashekhar@yahoo.com> wrote:
>> hi All,
>>
>> We have been using couchdb (built out of trunk) for prototyping an idea and would
>> like to thank and congratulate you folks for a simple and usable schema free db.
>>
>> We plan to store a few million documents in couchdb and we would like to create a couple
>> of views to fetch the data appropriately. We have inserted a million documents (each containing
>> about 20 fields). We are indexing/creating a view on a particular field of the document. The
>> map function of the view is a simple, straightforward emit (emit(doc.field, doc)). It takes
>> about 90 mins to build the required B-Tree index the first time. All the subsequent queries
>> are performing extremely well (millisecond responses). Can anything be done to reduce the
>> 90 mins taken to build the required B-Tree index the first time?
>>
>> Environment details:
>> Couchdb - 0.9.0a757326
>> Erlang - 5.6.5
>> Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686 GNU/Linux
>> Ubuntu distribution
>> Centrino Dual core, 4GB RAM laptop
>>
>> Thanks
>> Manju

-- 
Jason Smith
Proven Corporation
Bangkok, Thailand
http://www.proven-corporation.com
