incubator-couchdb-user mailing list archives

From Bill Foshay <bill.fos...@noteandgo.com>
Subject Re: Scaling with filtered replication
Date Tue, 09 Jul 2013 20:17:19 GMT
Jens Alfke <jens@...> writes:
> 
> Yes, I agree that CouchDB filtering is not significantly higher-CPU than not filtering :) and likely
> cheaper if you include the savings from not transmitting the filtered-out revisions.
> 
> But if you _do_ filter heavily, so any one client is seeing only a small fraction of the total update traffic,
> the filtering overhead starts to dominate as the number of clients grows. Because the server is still
> fetching, decoding and running a JS function on (say) 100 or 1000 rejected documents for every one that
> does get sent. That’s a pretty typical scenario for a system with mobile or desktop clients -- think of
> Exchange or SalesForce.com or Words With Friends; what fraction of the total server-side updates does
> any one client see?
> 
> The alternative is the hypothetical view-based filtering that’s been talked about here before, where
> the source db would iterate over a pre-filtered list of revisions from a view index rather than going
> through the entire by-sequence index. Or the actual-but-alpha-quality “channels” mechanism
> we’re using in the Couchbase Sync Gateway.
> 
> Anyway. I’m not meaning to harsh on filtering in general, and in the OP’s case it sounds like the target
> databases are corporate customers rather than end-users, so there probably aren’t nearly as many of
> them as in the scenarios I’m talking about.
> 
> —Jens
> 
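
[Illustrative note] The trade-off Jens describes can be sketched with a toy
model. This is not CouchDB code: the function names, the "owner" field and the
in-memory change list are all assumptions made up for the example. It only
counts how many changes the server has to examine per client when it scans the
whole by-sequence feed through a filter, versus when it reads from a list that
was grouped per owner ahead of time (roughly the idea behind view-based
filtering or channels).

```python
# Toy model only: not CouchDB code; every name here is hypothetical.
from collections import defaultdict


def filtered_scan(changes, since, wanted_owner):
    """Walk the whole by-sequence feed and test every change, as a filter does."""
    examined, sent = 0, []
    for seq, doc in changes:
        if seq <= since:
            continue
        examined += 1                     # fetched, decoded, filter function run
        if doc["owner"] == wanted_owner:  # stand-in for the JS filter function
            sent.append(doc["_id"])
    return examined, sent


def build_owner_index(changes):
    """Group changes by owner once, the way a pre-filtered index might."""
    index = defaultdict(list)
    for seq, doc in changes:
        index[doc["owner"]].append((seq, doc["_id"]))
    return index


def indexed_scan(index, since, wanted_owner):
    """Read only this owner's entries (a real index would seek to `since`)."""
    examined, sent = 0, []
    for seq, doc_id in index.get(wanted_owner, []):
        if seq > since:
            examined += 1
            sent.append(doc_id)
    return examined, sent


# 1000 owners with 10 updates each, interleaved in sequence order.
changes = [
    (seq, {"_id": f"doc{seq}", "owner": f"user{seq % 1000}"})
    for seq in range(1, 10001)
]

if __name__ == "__main__":
    print(filtered_scan(changes, 0, "user7")[0])                     # 10000 examined
    print(indexed_scan(build_owner_index(changes), 0, "user7")[0])   # 10 examined
```

In this model both approaches send the same 10 documents to the client, but the
filtered scan examines all 10000 changes to find them, and that work is
repeated for every client that replicates.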

Sorry, I phrased the question poorly. It makes sense that n^2 replications 
would present the same scaling problem as filtered replication. My concern 
is closer to what Jens describes. We do have mobile and desktop clients, and 
I'm worried that in the future the overhead of running the filter over every 
server-side update will cause performance problems, even though each client 
only cares about a small fraction of those updates. Are there any other 
approaches we could take, aside from the hypothetical view-based filtering 
Jens described?

Thanks,
Bill

