couchdb-user mailing list archives

From Dave Cottlehuber <d...@muse.net.nz>
Subject Re: view index generation for single document and bulk documents
Date Thu, 16 Jun 2011 08:27:07 GMT
On 16 June 2011 10:42, Yogesh Khambia <ykhambia@gmail.com> wrote:
> Hi all,
>
> Currently, I am doing performance tests with database in CouchDB 1.0.1,
> where a script is continuously writing single documents to the CouchDB
> database.
> The problem I ran into is that a user reading a view is penalized by having
> to wait for the view index to be updated. To reduce that latency, I wrote a
> function which updates the view index after each new write to the
> database.
> The latency for view index creation has been improved. However, I read from
> the FAQ on the CouchDB wiki that "The reason not to integrate each doc as it
> comes in is that it is horribly inefficient and CouchDB is designed to do
> view index updates very fast, so batching is a good idea."
> It would be really helpful if somebody could answer the following:
>
> - Is view index generation for each new single document insert a bad
> approach?

Hi Yogesh

I think the wiki is pretty clear: "it is horribly inefficient".

The reason is that for both docs and views, separate B-tree DB files
need to be updated. If you bulk-load docs, CouchDB can do the B-tree
balancing once for the whole batch, rather than once per doc. This is
a lot more efficient.
http://horicky.blogspot.com/2008/10/couchdb-implementation.html covers
well how this works under the hood.
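
A rough sketch of what bulk-loading could look like from a client
script, using CouchDB's _bulk_docs endpoint. The base URL, the database
name and the batch size of 1000 are placeholders I made up for
illustration, not tuned values, so adjust them for your setup.

```python
import json
import urllib.request


def chunked(docs, size):
    """Yield successive lists of at most `size` docs."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def bulk_request(base_url, db, docs):
    """Build (but do not send) a POST to /{db}/_bulk_docs."""
    body = json.dumps({"docs": docs}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{db}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def bulk_load(base_url, db, docs, batch_size=1000):
    """Send docs in batches: one HTTP request per batch, not per doc."""
    for batch in chunked(docs, batch_size):
        with urllib.request.urlopen(bulk_request(base_url, db, batch)) as resp:
            resp.read()  # per-doc results are returned here; ignored in this sketch
```

Something like bulk_load("http://localhost:5984", "perf", docs) would
then replace a loop of single-document writes.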

> - How do single document inserts and bulk document inserts affect
> CouchDB view generation, where:
>
>   1.  a daemon script updates the view index for each new document insert.
>   2.  a daemon script updates the view index for bulk document inserts.

IIRC all views in the same ddoc block while CouchDB updates its index.
Other than that, the same points as above apply. How much difference it
makes will vary heavily with how your doc ids are sequenced, what your
views output, your hardware, OS, etc. It is probably worth doing some
benchmarking and/or providing more info on your use case.
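
If you want a starting point for benchmarking, here is a minimal sketch
of a daemon step that only refreshes the view once enough writes have
piled up, instead of after every document. It assumes the integer
update_seq that CouchDB 1.x reports in the database info, and that
querying a view with limit=0 is enough to bring its index up to date.
The function names and the threshold of 500 are hypothetical; measure
before settling on a value.

```python
import json
import urllib.request


def should_refresh(indexed_seq, current_seq, min_pending):
    """True once at least `min_pending` writes happened since the last refresh."""
    return current_seq - indexed_seq >= min_pending


def refresh_url(base_url, db, ddoc, view):
    """View query that asks for no rows; assumed to still update the index."""
    return f"{base_url}/{db}/_design/{ddoc}/_view/{view}?limit=0"


def poll_once(base_url, db, ddoc, view, indexed_seq, min_pending=500):
    """Check the db's update_seq and refresh the view if enough has changed.

    Returns the sequence number the index is now known to cover.
    """
    with urllib.request.urlopen(f"{base_url}/{db}") as resp:
        current_seq = json.load(resp)["update_seq"]  # integer in CouchDB 1.x
    if should_refresh(indexed_seq, current_seq, min_pending):
        with urllib.request.urlopen(refresh_url(base_url, db, ddoc, view)) as resp:
            resp.read()
        return current_seq
    return indexed_seq
```

Calling poll_once in a loop with a sleep, and comparing min_pending=1
against larger values, should show how much batching buys you on your
data.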

Somebody more familiar with the code may be able to add more if you need it.

A+
Dave
