couchdb-user mailing list archives

From Benoit Chesneau <bchesn...@gmail.com>
Subject Re: Scaling with filtered replication
Date Wed, 10 Jul 2013 04:46:30 GMT
On Jul 9, 2013 8:43 PM, "Jens Alfke" <jens@couchbase.com> wrote:
>
>
> On Jul 9, 2013, at 11:09 AM, Robert Newson <rnewson@apache.org> wrote:
>
> > If you didn’t have filters at all, but still had n^2 replications,
> > you've still got a scaling problem, it's just not directly related to
> > the filtering overhead.
>
> Yes, I agree that CouchDB filtering is not significantly higher-CPU than
> not filtering :) and likely cheaper if you include the savings from not
> transmitting the filtered-out revisions.
>
> But if you _do_ filter heavily, so any one client is seeing only a small
> fraction of the total update traffic, the filtering overhead starts to
> dominate as the number of clients grows. Because the server is still
> fetching, decoding and running a JS function on (say) 100 or 1000 rejected
> documents for every one that does get sent. That’s a pretty typical
> scenario for a system with mobile or desktop clients: think of Exchange or
> SalesForce.com or Words With Friends; what fraction of the total
> server-side updates does any one client see?
>
> The alternative is the hypothetical view-based filtering that’s been
> talked about here before, where the source db would iterate over a
> pre-filtered list of revisions from a view index rather than going through
> the entire by-sequence index. Or the actual-but-alpha-quality “channels”
> mechanism we’re using in the Couchbase Sync Gateway.
>

It is not hypothetical. View-based replication (and view-based changes) is
actually available in rcouch (http://rcouch.org) and used in production. It
will be merged into CouchDB as soon as possible (the IP issue should be
resolved this month). In the meantime you can use rcouch, which is based on
CouchDB 1.3.

More info here:

https://github.com/refuge/rcouch/wiki/View-Changes

https://github.com/refuge/rcouch/wiki/Replication-with-view-changes

Compared to the Sync Gateway (written in Go) from Couchbase, it does not hack
around views: it creates a plain index of changes at the same time the view
is indexed. Also, it doesn't introduce new concepts like channels.

Evaluating a JS filter against a lot of documents, or across many requests,
can be really slow, especially when you start a replication on a large
database. This initial replication can take a long time. This is why view
changes were added to rcouch.
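
To make the cost difference concrete, here is a minimal Python sketch. It is
an in-memory simulation only, not rcouch or CouchDB code: every function and
variable name is hypothetical, and the by-sequence index is simplified to a
list of (seq, doc) pairs. It contrasts a filtered feed, which evaluates the
filter on every update, with a per-key index of sequence numbers that is
built once alongside the view and then read directly.

```python
from bisect import bisect_right

# Hypothetical in-memory stand-ins; none of these names come from rcouch or CouchDB.

def filtered_changes(by_seq, filter_fn, since=0):
    """Classic filtered feed: run the filter on every update after `since`."""
    matches, evaluated = [], 0
    for seq, doc in by_seq:
        if seq <= since:
            continue
        evaluated += 1              # cost paid again for each client and request
        if filter_fn(doc):
            matches.append(seq)
    return matches, evaluated

def build_view_changes_index(by_seq, map_fn):
    """Built once, while the view is indexed: emitted key -> ascending seqs."""
    index = {}
    for seq, doc in by_seq:
        for key in map_fn(doc):
            index.setdefault(key, []).append(seq)
    return index

def view_changes(index, key, since=0):
    """Read only the pre-filtered seqs for one key, skipping those <= since."""
    seqs = index.get(key, [])
    return seqs[bisect_right(seqs, since):]

# 10,000 updates spread over 1,000 owners: each client cares about 0.1% of them.
by_seq = [(seq, {"_id": f"doc{seq}", "owner": f"user{seq % 1000}"})
          for seq in range(1, 10001)]

slow, evaluated = filtered_changes(by_seq, lambda doc: doc["owner"] == "user7")
index = build_view_changes_index(by_seq, lambda doc: [doc["owner"]])
fast = view_changes(index, "user7")

print(evaluated, len(slow), slow == fast)  # 10000 10 True
```

In this toy run the filtered feed evaluates 10,000 documents to return 10
changes, while the indexed lookup touches only those 10 entries. The indexing
cost is still paid, but once at view-indexing time rather than per client.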

benoit.
