couchdb-user mailing list archives

From Talib Sharif <tsha...@mymedify.com>
Subject Re: Some stats about couch DB
Date Sat, 24 Jul 2010 05:51:22 GMT
Thanks Chris,

This is extremely helpful.

-Talib

On Jul 23, 2010, at 6:42 PM, J Chris Anderson wrote:

>
> On Jul 23, 2010, at 5:01 PM, Talib Sharif wrote:
>
>> Hi All,
>>
>> As I play more and more with CouchDB (it is relaxing and
>> fun), I am trying to understand its limits and what to
>> expect in a large production environment.
>>
>> Right now I have about 100K documents and about 10 different
>> views; one of the views does about 100 emits per document.
>>
>> Building the view indexes is taking about 7-8 hours.
>>
>
> This is about right for 10 million rows (100K documents x 100
> emits). That works out to about 350 rows per second (maybe more,
> depending on what your other views are doing), which is a bit
> slower than I'm used to seeing, but it depends on the size of your
> emitted keys and values. If you can shrink the keys or the values
> you should see some speedup (marginal, not an order of magnitude).
>
> Because view generation is incremental, in production the 7-8 hours
> isn't the big issue; what matters is whether view generation can
> keep up with the insert rate. So if you are inserting fewer than a
> few documents per second (x 100 emitted rows) then you should be
> able to keep the indexes current. If the indexes start to fall
> behind, I'd suggest either upgrading hardware or moving to a
> clustered solution like CouchDB-Lounge.
>
> For purposes of prototyping you will probably be happier working on
> a subset of the documents.
>
>
>> I would like to know how other people are using it.
>> Is 7-8 or even 24 hours of checkpointing view generation typical?
>> How many documents do people have?
>> What is other people's experience in generating a view on, let's
>> say, 1 million documents?
>>
>> I have switched to the native _sum function for reduce. What else  
>> is taking long? Is it the map function written in JavaScript? Is it  
>> the Index that's getting too big?
>>
>
>
> Using an Erlang view function could potentially speed things up (but
> my guess is you are more likely disk-IO bound, not CPU bound, so
> maybe it won't make much difference).
>
>
>> Is view generation linear, or does it get worse when you have
>> more documents?
>>
>
>
> The btree should get slower at roughly O(log n), where n is the
> number of rows. The base of the log is pretty big, too. Once you get
> up to billion-row territory you'll probably want to look more
> closely at CouchDB-Lounge or the Cloudant clustering.
>
>> I would greatly appreciate help in answering or discussing these
>> questions.
>>
>> Thanks in advance,
>> Talib
>>
>
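
The throughput figures in the quoted reply can be checked with a little arithmetic. The following is a minimal sketch that uses only the numbers given in the thread (100K documents, about 100 emits per document, 7-8 hours of build time); the function name is illustrative, and the other nine views are ignored, so the real row rate may be somewhat higher.

```python
def rows_per_second(docs, emits_per_doc, hours):
    """Average indexing throughput in emitted rows per second."""
    return docs * emits_per_doc / (hours * 3600)

DOCS = 100_000        # from the thread
EMITS_PER_DOC = 100   # from the thread (the heaviest view only)

total_rows = DOCS * EMITS_PER_DOC               # 10 million rows
slow = rows_per_second(DOCS, EMITS_PER_DOC, 8)  # 8-hour build
fast = rows_per_second(DOCS, EMITS_PER_DOC, 7)  # 7-hour build

# Rough insert rate the indexer could keep up with at the slower pace,
# if every new document also emits about 100 rows.
sustainable_docs_per_second = slow / EMITS_PER_DOC

print(f"total rows: {total_rows}")
print(f"rows/sec: {slow:.0f} to {fast:.0f}")
print(f"sustainable inserts: about {sustainable_docs_per_second:.1f} docs/sec")
```

At the slower pace this comes out to roughly 350 rows per second, or a few documents per second, which is consistent with the estimate in the reply.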

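The O(log n) remark can be illustrated in the same spirit. This is only a sketch of how slowly tree depth grows when the base of the logarithm is large; the fan-out of 100 is a made-up illustrative value, not CouchDB's actual node size, which the thread does not state.

```python
def btree_levels(rows, fanout):
    """Approximate number of levels in a B-tree holding `rows` entries,
    assuming every node has `fanout` children (a simplification)."""
    levels = 1
    capacity = fanout
    while capacity < rows:
        capacity *= fanout
        levels += 1
    return levels

ASSUMED_FANOUT = 100  # hypothetical value, chosen only for illustration

for rows in (1_000_000, 10_000_000, 1_000_000_000):
    print(rows, btree_levels(rows, ASSUMED_FANOUT))
```

Under that assumption, going from 10 million rows to a billion rows adds only one level to the tree, which is why lookup and update cost grows so gently with row count.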
