couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Suggestions on View performance optimization/improvement
Date Wed, 01 Apr 2009 17:54:58 GMT
On Wed, Apr 1, 2009 at 1:51 PM, Michael McDaniel <couchdb@autosys.us> wrote:
> On Thu, Apr 02, 2009 at 12:31:17AM +0700, Jason Smith wrote:
>> I'd be very interested to know the performance impact of that
>> optimization as well.  What is the overhead or bottleneck with large
>> view values?  Estimating 100 bytes per key/value pair within each of the
>> million documents, that's 2GB of raw data, which should write to a
>> laptop disk within 2 minutes.
>>
>> I'm wondering whether it matters how large the view values are, since
>> they would seem not to be involved in the view processing very
>> much--only written to disk in the order defined by the keys.
>>
>> Of course, that goes against the common wisdom that the fastest thing to
>> do is emit(key, null); but that could impact the application
>> significantly since you have to query again for the documents.  (I'm
>> unsure whether include_docs has a performance penalty either.)
>>
>> I guess what I'm asking is, why does the value side of views impact
>> performance so greatly?
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>  last I checked, there is an Erlang term() <-> JSON <-> Erlang term()
>  conversion for values on the initial view (to/from view server)
>

Not to mention the JSON -> JS_Object* -> JSON inside of couchjs though
that's probably quite a bit quicker :D
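
To make the serialization point concrete, here is a rough sketch that compares how much JSON each style of map function produces. It is only an illustration: the emit() stub, sampleDoc() and emittedJsonLength() are made up for this example and are not part of couchjs, and the sample documents are invented. Only emit(doc.field, doc) and emit(doc.field, null) come from the thread.

```javascript
// Stand-in for the view server's emit(); it just collects rows so they can
// be inspected. Hypothetical harness, not couchjs itself.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function from the original post: the whole document is the value.
function mapFull(doc) {
  emit(doc.field, doc);
}

// kowsik's suggestion: emit the key only and fetch the document at query time.
function mapLean(doc) {
  emit(doc.field, null);
}

// Made-up sample document with 20 fields, roughly matching the description.
function sampleDoc(i) {
  var doc = { _id: "doc-" + i, field: "key-" + i };
  for (var n = 0; n < 18; n++) {
    doc["f" + n] = "some value " + n;
  }
  return doc;
}

// Length of the JSON text produced by running one map function over `count`
// sample documents. Every character of it has to be generated and parsed again.
function emittedJsonLength(mapFn, count) {
  rows = [];
  for (var i = 0; i < count; i++) {
    mapFn(sampleDoc(i));
  }
  return JSON.stringify(rows).length;
}

var fullLength = emittedJsonLength(mapFull, 1000);
var leanLength = emittedJsonLength(mapLean, 1000);
console.log("emit(doc.field, doc):  " + fullLength + " chars of JSON");
console.log("emit(doc.field, null): " + leanLength + " chars of JSON");
```

This says nothing about actual build times, only about how much data gets converted on each pass. With the lean form, the documents are still available at query time by adding include_docs=true to the view request, as kowsik notes below.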

> ~M
>
>>
>> kowsik wrote:
>>> I would highly recommend that you do emit(doc.field, null) so that the
>>> key space doesn't get unwieldy and large. Since the id of the document
>>> is part of the map results, you can always fetch it using
>>> include_docs=true.
>>>
>>> K.
>>>
>>> On Wed, Apr 1, 2009 at 10:12 AM, Manjunath Somashekhar
>>> <manjunath_somashekhar@yahoo.com> wrote:
>>>> hi All,
>>>>
>>>> We have been using couchdb (built out of trunk) for prototyping an idea
>>>> and would like to thank and congratulate you folks for a simple and
>>>> usable schema free db.
>>>>
>>>> We plan to store a few million documents in couchdb and we would like to
>>>> create a couple of views to fetch the data appropriately. We have inserted
>>>> a million documents (each containing about 20 fields). We are
>>>> indexing/creating a view on a particular field of the document. The map
>>>> function of the view is a simple, straightforward emit:
>>>> emit(doc.field, doc). It takes about 90 mins to build the required B-Tree
>>>> index the first time. All the subsequent queries perform extremely well
>>>> (millisecond responses). Can anything be done to reduce the 90 mins taken
>>>> to build the required B-Tree index the first time?
>>>>
>>>> Environment details:
>>>> Couchdb - 0.9.0a757326
>>>> Erlang - 5.6.5
>>>> Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686 GNU/Linux
>>>> Ubuntu distribution
>>>> Centrino Dual core, 4GB RAM laptop
>>>>
>>>> Thanks
>>>> Manju
>>>>
>>>>
>>>>
>>>>
>>
>> --
>> Jason Smith
>> Proven Corporation
>> Bangkok, Thailand
>> http://www.proven-corporation.com
>
> --
> Michael McDaniel
> Portland, Oregon, USA
> http://trip.autosys.us
> http://autosys.us
> http://mmcdaniel.com/erlview
>
>
