incubator-couchdb-user mailing list archives

From J Chris Anderson <jch...@apache.org>
Subject Re: Scalability of _changes api?
Date Sat, 07 Aug 2010 19:17:37 GMT

On Aug 7, 2010, at 10:39 AM, J Chris Anderson wrote:

> 
> On Aug 7, 2010, at 4:37 AM, Sivan Greenberg wrote:
> 
>> As I am also using a JS filter and fear the performance and load
>> consequences, how does one go about writing an erlang filter?
>> 
> 
> The Erlang filter is gonna be the most efficient option. First create a design doc with language == "erlang"
> 
> Then write your filter like the "show" function here, except that it returns true or false.
> 
> http://github.com/apache/couchdb/blob/trunk/share/www/script/test/erlang_views.js#L57
> 
> There is another example in this (fixed) bug report:
> 
> https://issues.apache.org/jira/browse/COUCHDB-740
> 

Found a better example of an Erlang changes filter here:

http://github.com/apache/couchdb/blob/trunk/share/www/script/test/changes.js#L374
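
For anyone who cannot follow the links, here is a rough sketch of what such a design doc might look like, built as a Python dict so it can be serialized and PUT to the database. The design doc name "app", the filter name "even", and the "value" field are hypothetical. The Erlang function body is an assumption based on the description above (a fun that takes the doc and the request and returns true or false), not a copy of the linked tests. It also assumes the native Erlang query server is enabled in the server configuration.

```python
import json
from urllib.parse import urlencode

# Erlang source for the filter, stored as a plain string in the design doc.
# Assumption: the filter is a fun of (Doc, Req) returning true or false,
# and the document arrives as a {Proplist} tuple.
ERLANG_FILTER = (
    'fun({Doc}, _Req) -> '
    'case proplists:get_value(<<"value">>, Doc) of '
    'V when is_integer(V) -> V rem 2 =:= 0; '
    '_ -> false '
    'end '
    'end.'
)


def build_design_doc(name="app", filter_name="even"):
    """Return a design doc dict whose filter is written in Erlang."""
    return {
        "_id": "_design/" + name,
        "language": "erlang",
        "filters": {filter_name: ERLANG_FILTER},
    }


def changes_url(base, db, name="app", filter_name="even", since=0):
    """Build a _changes URL that applies the filter from the design doc."""
    query = urlencode({"filter": name + "/" + filter_name, "since": since})
    return "%s/%s/_changes?%s" % (base.rstrip("/"), db, query)


if __name__ == "__main__":
    print(json.dumps(build_design_doc(), indent=2))
    print(changes_url("http://localhost:5984", "mydb"))
```

The filter is referenced on the _changes request as "designdocname/filtername", which is why the helper joins the two names into the filter query parameter.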

> Chris
> 
> 
>> -Sivan
>> 
>> On Fri, Aug 6, 2010 at 11:55 PM, J Chris Anderson <jchris@apache.org> wrote:
>>> 
>>> On Aug 6, 2010, at 1:38 PM, Talib Sharif wrote:
>>> 
>>>> Hey All,
>>>> 
>>>> Do people have experience with the scalability and performance of the _changes API in general, and especially when using it with filters?
>>>> 
>>>> How many connections can be kept open?
>>>> 
>>> 
>>> If you use a JavaScript filter, you will have more limited concurrency than with an Erlang filter, as the JS filters run in their own OS process. CouchDB tries to be reasonably efficient with these, but they are still much more heavyweight than the Erlang ones.
>>> 
>>>> And is the changes API a function of size/updates/total_no_documents?
>>>> 
>>> 
>>> I think the _changes API should have no scalability issues, as the wall you will hit long before an (Erlang) changes filter becomes the bottleneck is the insert / update rate of the database itself.
>>> 
>>> If you were to, say, make thousands of concurrent changes requests with varying since=Seq params, that would be the worst case, so you can test that workload if you want to find boundary conditions. (The bottleneck here would be disk IO reads, I think.)
>>> 
>>> Chris
>>> 
>>>> Thanks,
>>>> Talib
>>> 
>>> 
> 
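
The worst case described in the quoted reply (many concurrent _changes requests, each with a different since value) could be exercised with something like the sketch below. It is only a starting point under stated assumptions: the host, database name, request count, and worker count are hypothetical, sequence numbers are assumed to be plain integers, and a thread pool stands in for whatever load tool you would really use.

```python
import random
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def make_urls(base, db, count, max_seq, seed=0):
    """Build `count` _changes URLs, each with a random since value."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    root = "%s/%s/_changes" % (base.rstrip("/"), db)
    return ["%s?since=%d" % (root, rng.randint(0, max_seq))
            for _ in range(count)]


def fetch(url, timeout=30):
    """Fetch one _changes response and return its size in bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())


def run(urls, fetch_fn=fetch, workers=50):
    """Issue the requests concurrently; results keep the order of `urls`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_fn, urls))


if __name__ == "__main__":
    # Hypothetical target; point this at a test database, not production.
    urls = make_urls("http://localhost:5984", "mydb", count=1000, max_seq=100000)
    sizes = run(urls)
    print("completed %d requests" % len(sizes))
```

Watching disk read activity on the server while this runs should show whether disk IO is the limit, as suggested above.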

