couchdb-dev mailing list archives

From Henrik Thostrup Jensen <thost...@gmail.com>
Subject Performance Regression for view generation in 0.11
Date Fri, 12 Mar 2010 10:45:23 GMT
Hi [resend; the first attempt didn't appear as I was unsubscribed]

I recently ran some performance tests of different _id sizes [1], including a
comparison between stock 0.10 and a 0.11 snapshot. Unexpectedly, 0.11 turned
out to be slower for view generation, despite the work done in COUCHDB-495.
I've poked a bit at the problem, and I think the reason (or part of it) is
the increased number of fsync calls made for checkpointing during view
generation. In 0.11, checkpointing is done every second, whereas in 0.10 it is
done roughly every 10-15 seconds. This increases the disk load and creates
large B-trees due to extensive shadowing (at least that is what I observe).

Is it possible to configure this interval? Delayed commits are not what I want,
as I want insertions to fsync. I tried changing the three 1000 constants in the
send_after delays in couch_view_group.erl, but checkpointing was still done
every second (I'm not too strong on Erlang, so I had a bit of a problem
grokking the code). In the case of a crash, a longer interval only means that
a certain part of the db has to be rescanned for the view, so IMHO the
checkpointing interval could easily be set to 60 seconds or so.
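
To make the interval argument concrete, here is a minimal Python sketch of
time-throttled checkpointing. It is not CouchDB's code: the Checkpointer class,
the commit callback and the simulate helper are hypothetical names made up for
illustration, and the simulation assumes the indexer polls once per second and
that each checkpoint costs one fsync.

```python
import time


class Checkpointer:
    """Run a checkpoint callback at most once per `interval` seconds."""

    def __init__(self, interval, commit, clock=time.monotonic):
        self.interval = interval
        self.commit = commit  # hypothetical callback, e.g. write header + fsync
        self.clock = clock    # injectable so the behaviour can be simulated
        self.last = clock()
        self.count = 0

    def maybe_checkpoint(self):
        """Checkpoint only if at least `interval` seconds have elapsed."""
        now = self.clock()
        if now - self.last >= self.interval:
            self.commit()
            self.last = now
            self.count += 1
            return True
        return False


def simulate(build_seconds, interval):
    """Count checkpoints over a simulated build, polling once per second."""
    now = [0]  # fake clock in whole seconds, avoids float rounding
    cp = Checkpointer(interval, commit=lambda: None, clock=lambda: now[0])
    for second in range(1, build_seconds + 1):
        now[0] = second
        cp.maybe_checkpoint()
    return cp.count


if __name__ == "__main__":
    for interval in (1, 15, 60):
        print(interval, simulate(600, interval))
```

Under those assumptions, a 600-second view build checkpoints 600 times at a
1-second interval, 40 times at 15 seconds and 10 times at 60 seconds, which is
the kind of reduction in fsync load I am hoping for.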

There is a lot of guessing in the above, and I'm not sure of everything.
However, as we accumulate more data, view generation time is starting
to hurt when creating new views, so I'd like to see CouchDB become faster at
this (we expect to shrink the _id field somewhat, which should help a bit as
well, but it won't buy us that much).

Could there be any other explanations for the performance regressions?

Please CC any replies, as I am not subscribed.


 best regards, Henrik


1. http://mail-archives.apache.org/mod_mbox/couchdb-user/201003.mbox/%3Cf33a4da21003040541y4e59ae35pb
