incubator-couchdb-user mailing list archives

From Julian Moritz
Subject Re: performance issues
Date Mon, 05 Apr 2010 19:19:36 GMT
Hi J Chris,

J Chris Anderson wrote:
> On Apr 5, 2010, at 11:52 AM, Julian Moritz wrote:
>> Hi,
>> Julian Moritz wrote:
>>> Hi,
>> I've just found this via google:
>>>> We don't parallelize view index creation yet, so this is not an
>>>> additional problem for you. You can however build two views in
>>>> parallel and make use of two cores that way.
>> If this is (still) true, view index creation is the bottleneck of my
>> application. Since I'm just playing around and am already using 100% of
>> one core, I cannot use CouchDB anymore.
> We rarely see view generation that is actually limited by view-function
> execution speed. The majority of the time the actual bottleneck is disk IO.
> To parallelize view generation the best option is to run a CouchDB-Lounge
> cluster.

Hm, at the moment I have access to two computers. This isn't what you
mean by a "couchdb-lounge cluster", right?

> It looks like you might be better off removing your reduce function, which
> might also speed things up.

But I need it for making my list unique. This is an important feature
for my application.

Thanks, I'll think about how to set up a couchdb cluster and do more


> Chris
>> Regards
>> Julian
>>> I've developed a (in my eyes) simple view. I have a wordlist which does
>>> not contain unique words. I want to store it in a view, with every word
>>> occurring once and in random order. Therefore I have a simple view
>>> function:
>>> function(doc) {
>>>   emit([hash(doc.word), doc.word], null);
>>> }
>>> and a simple reduce:
>>> function(key, values, rereduce) {
>>>   return true;
>>> }
>>> Calling that view with group=true does what I want.
>>> When storing plenty of words to the database, one of my two cpu cores is
>>> used completely by couchjs.
>>> Isn't the view built using two (or all) cpu cores? I thought (obviously
>>> I'm wrong) that it would be calculated in parallel and that using a
>>> quad-core (or more cores) would make storing faster.
>>> Is there a solution for that? Should I use another query-server?
>>> Regards
>>> Julian
