incubator-couchdb-user mailing list archives

From "Brad King" <brk...@gmail.com>
Subject Re: view index build time
Date Thu, 03 Jul 2008 13:35:52 GMT
That would be fantastic, but it sounds like other users are seeing
performance similar to what I see. When you say tuning and
optimizations, are you talking about code changes in future versions
of couchdb or parameters we can change now? VM is definitely a
variable. I probably should try this out on real hardware too and
compare.

On Wed, Jul 2, 2008 at 7:30 PM, Damien Katz <damienkatz@gmail.com> wrote:
> This sounds really slow, like something's wrong. 25 minutes to process 300k
> means ~200 docs/sec, or each document takes 5ms. That's a really long time
> CPU wise.
>
> Assuming it's not another VM bug, we should be able to get that down
> to under a minute with some tuning, and probably closer to 10 secs after
> serious optimizations.
>
> -Damien
>
>
> On Jul 2, 2008, at 6:28 PM, Chris Anderson wrote:
>
>> On Wed, Jul 2, 2008 at 3:08 PM, Paul Davis <paul.joseph.davis@gmail.com>
>> wrote:
>>>
>>> I'd have to go back and double check, but off the top of my head 25
>>> min for 300K docs seems about like what I was getting. Ie, not orders
>>> of magnitude slower or anything.
>>
>> In my experience, views generate about 1/2 as fast as that, if not
>> more slowly. My views are often quite complex with a lot of internal
>> looping and multiple emits, so that probably explains it. In short,
>> the times you're reporting seem reasonable.
>>
>> The bottleneck (based on my extremely unscientific use of top) doesn't
>> seem to be the view server, but rather CouchDB's beam process, which
>> as I understand it, is busy sorting the results as they come back from
>> the view server. So the quickest route to parallelizing this may be to
>> manually partition your data across CouchDB instances, generate the
>> views, and query them in parallel, merging the results in your
>> application.
>>
>> I don't actually plan to do all that work until my insert rate
>> eclipses CouchDB's view generation speed. :)
>>
>> Once upon a time there was a feature to return the available results
>> of a view, even while generation is still occurring. The feature has
>> fallen by the wayside, and it would be non-trivial to turn it back on,
>> according to Damien on IRC. Maybe if it would be useful to enough
>> people, we'll see it again.
>>
>> --
>> Chris Anderson
>> http://jchris.mfdz.com
>
>
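
The build rates discussed in the quoted messages come down to simple division. A quick check, assuming "300k" means 300,000 documents and the build takes a full 25 minutes (the `rate` helper is made up for illustration):

```python
# Back-of-the-envelope check of the view build rates discussed in the thread.
DOCS = 300_000  # assumption: "300k" means 300,000 documents


def rate(docs, seconds):
    """Return (documents per second, milliseconds per document)."""
    per_sec = docs / seconds
    return per_sec, 1000.0 / per_sec


scenarios = [
    ("observed (25 min)", rate(DOCS, 25 * 60)),
    ("tuned (1 min)", rate(DOCS, 60)),
    ("optimized (10 s)", rate(DOCS, 10)),
]

for label, (per_sec, ms_per_doc) in scenarios:
    print(f"{label}: {per_sec:,.0f} docs/sec, {ms_per_doc:.3f} ms/doc")
```

Under those assumptions, getting from 25 minutes to under a minute would mean roughly a 25x speedup, and 10 seconds roughly 150x.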

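The manual partitioning Chris describes (query each instance's view in parallel, then merge in the application) could look roughly like the sketch below. It is only an illustration, not anything CouchDB provides: the function names are made up, each "fetcher" stands in for whatever HTTP call returns one instance's view rows already sorted by key, and it assumes the keys are simple values (such as integers) whose Python ordering matches the view's ordering.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def query_partitions(fetchers):
    """Call every partition's fetch function in parallel.

    Each fetcher is a hypothetical zero-argument callable returning the
    rows of one instance's view, already sorted by key.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(fetchers))) as pool:
        futures = [pool.submit(fetch) for fetch in fetchers]
        return [future.result() for future in futures]


def merge_rows(partition_rows):
    """Merge per-partition sorted row lists into one sorted list."""
    # heapq.merge does a streaming k-way merge of already-sorted inputs.
    return list(
        heapq.merge(*partition_rows, key=lambda row: (row["key"], row["id"]))
    )


# In-memory stand-ins for two instances' view results (made-up data).
partition_a = [
    {"key": 1, "id": "a1", "value": None},
    {"key": 4, "id": "a2", "value": None},
]
partition_b = [
    {"key": 2, "id": "b1", "value": None},
    {"key": 3, "id": "b2", "value": None},
]

fetchers = [lambda rows=partition_a: rows, lambda rows=partition_b: rows]
merged = merge_rows(query_partitions(fetchers))
print([row["key"] for row in merged])
```

Because each partition is already sorted, the merge step is cheap compared with the view build itself; the cost that matters is still each instance generating its own index.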