incubator-couchdb-user mailing list archives

From Michael McDaniel <couc...@autosys.us>
Subject Re: Suggestions on View performance optimization/improvement
Date Wed, 01 Apr 2009 17:51:33 GMT
On Thu, Apr 02, 2009 at 12:31:17AM +0700, Jason Smith wrote:
> I'd be very interested to know the performance impact of that  
> optimization as well.  What is the overhead or bottleneck with large  
> view values?  Estimating 100 bytes per key/value pair within each of the  
> million documents, that's 2GB of raw data, which should write to a  
> laptop disk within 2 minutes.
>
> I'm wondering whether it matters how large the view values are, since  
> they would seem not to be involved in the view processing very  
> much--only written to disk in the order defined by the keys.
>
> Of course, that goes against the common wisdom that the fastest thing to  
> do is emit(key, null); but that could impact the application  
> significantly since you have to query again for the documents.  (I'm  
> unsure whether include_docs has a performance penalty either.)
>
> I guess what I'm asking is, why does the value side of views impact  
> performance so greatly?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Last I checked, each emitted value goes through an Erlang term() <-> JSON
 <-> Erlang term() conversion when the view is first built (to and from the
 view server).
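
 A rough sketch of why that conversion makes the value side matter: the
 snippet below is not CouchDB code, only a stand-in for the map step that
 measures how much JSON each emit style produces per document. The
 runMap helper, the emit-as-argument signature, and the sample document
 (20 fields of 100 bytes, following the estimate quoted above) are all
 assumptions made for illustration.

```javascript
// Hypothetical stand-in for the view server's map step (not CouchDB's code).
// A real CouchDB map function is function(doc) with a global emit();
// here emit is passed in so the sketch runs on its own.
function runMap(mapFn, doc) {
  const rows = [];
  const emit = (key, value) => rows.push([key, value]);
  mapFn(doc, emit);
  // Emitted rows are serialized to JSON before going back to Erlang.
  return JSON.stringify(rows);
}

// Made-up document: 20 fields of 100 bytes each, per the estimate in the thread.
const doc = { _id: "doc-1" };
for (let i = 0; i < 20; i++) {
  doc["field" + i] = "x".repeat(100);
}

// The two emit styles discussed in the thread.
const emitWholeDoc = (d, emit) => emit(d.field0, d);
const emitNull = (d, emit) => emit(d.field0, null);

const withDoc = runMap(emitWholeDoc, doc).length;
const withNull = runMap(emitNull, doc).length;

console.log("emit(key, doc):  " + withDoc + " characters of JSON per document");
console.log("emit(key, null): " + withNull + " characters of JSON per document");
```

 With emit(key, doc) the whole document is serialized a second time on
 the way back from the view server; with emit(key, null) only the key
 is. This shows the size difference only, not the actual time spent.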

~M

>
> kowsik wrote:
>> I would highly recommend that you do emit(doc.field, null) so that the
>> key space doesn't get unwieldy and large. Since the id of the document
>> is part of the map results, you can always fetch it using
>> include_docs=true.
>>
>> K.
>>
>> On Wed, Apr 1, 2009 at 10:12 AM, Manjunath Somashekhar
>> <manjunath_somashekhar@yahoo.com> wrote:
>>> hi All,
>>>
>>> We have been using couchdb (built out of trunk) for prototyping an idea and would
>>> like to thank and congratulate you folks for a simple and usable schema-free db.
>>>
>>> We plan to store a few million documents in couchdb and we would like to create
>>> a couple of views to fetch the data appropriately. We have inserted a million documents (each
>>> containing about 20 fields). We are creating a view on a particular field of the
>>> document. The map function of the view is a simple, straightforward emit: emit(doc.field, doc).
>>> It takes about 90 minutes to build the required B-tree index the first time. All subsequent
>>> queries perform extremely well (millisecond responses). Can anything be done to reduce
>>> the 90 minutes taken to build the required B-tree index the first time?
>>>
>>> Environment details:
>>> Couchdb - 0.9.0a757326
>>> Erlang - 5.6.5
>>> Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686 GNU/Linux
>>> Ubuntu distribution
>>> Centrino Dual core, 4GB RAM laptop
>>>
>>> Thanks
>>> Manju
>>>
>>>
>>>
>>>
>
> -- 
> Jason Smith
> Proven Corporation
> Bangkok, Thailand
> http://www.proven-corporation.com

-- 
Michael McDaniel
Portland, Oregon, USA
http://trip.autosys.us
http://autosys.us
http://mmcdaniel.com/erlview

