couchdb-dev mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Half-baked idea: incremental virtual databases
Date Mon, 04 Feb 2013 22:50:43 GMT
I had a mind to teach the _replicator db this trick. Since we have a
record of everything we need to resume a replication, there's no reason
for a one-to-one correspondence between a _replicator doc and a
replicator process. We can simply run N of them for a bit (say, a batch
of 1000 updates) and then switch to others. The internal db_updated
mechanism is a good way to notice when we might have updates worth
sending, but it's only half the story. A round-robin over all
_replicator docs (other than one-shot ones, of course) seems a really
neat trick to me.
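
Roughly what I have in mind, as an untested sketch (run_batch/2 below is
just a stand-in for the real resume/checkpoint machinery, nothing that
exists today):

    -module(rep_round_robin).
    -export([rotate/2]).

    %% Queue holds the doc ids of the continuous _replicator docs (one-shot
    %% ones stay out of the rotation); Slots is how many replications we
    %% let run at once.
    rotate(Queue0, Slots) ->
        {Active, Queue1} = take(Slots, Queue0, []),
        %% Let each active replication push at most ~1000 updates, then
        %% checkpoint and give its slot up.
        lists:foreach(fun(DocId) -> run_batch(DocId, 1000) end, Active),
        %% Everybody goes to the back of the line for the next pass.
        Queue2 = lists:foldl(fun queue:in/2, Queue1, Active),
        rotate(Queue2, Slots).

    take(0, Q, Acc) -> {lists:reverse(Acc), Q};
    take(N, Q0, Acc) ->
        case queue:out(Q0) of
            {{value, Id}, Q1} -> take(N - 1, Q1, [Id | Acc]);
            {empty, Q1}      -> {lists:reverse(Acc), Q1}
        end.

    %% Stand-in: resume the replication described by DocId from its
    %% checkpoint, process at most Limit updates, and checkpoint again.
    run_batch(_DocId, _Limit) ->
        ok.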

B.

On 4 February 2013 22:39, Jan Lehnardt <jan@apache.org> wrote:
>
> On Feb 4, 2013, at 23:14, Nathan Vander Wilt <natevw@gmail.com> wrote:
>
>> On Jan 29, 2013, at 5:53 PM, Nathan Vander Wilt wrote:
>>> So I've heard from both hosting providers that it is fine, but I also
>>> managed to take both of their shared services "down" with only ~100 users
>>> (200 continuous filtered replications). I'm only now at the point where I
>>> have tooling to build arbitrarily large tests on my local machine to see
>>> the stats for myself, but as I understand it the issue is that every
>>> replication needs at least one couchjs process to do its filtering for it.
>>>
>>> So rather than inactive users mostly just taking up disk space, each of
>>> them is instead costing a full-fledged process's worth of memory and
>>> system resources, all the time. As I understand it, this isn't much
>>> better on BigCouch either: since the data is scattered ± evenly across
>>> the machines, the *computation* is spread, but each node in the cluster
>>> still needs k*numberOfUsers couchjs processes running. So it's "scalable"
>>> in the sense that traditional databases are scalable: vertically, by
>>> buying machines with more and more memory.
>>>
>>> Again, I am still working on getting a better feel for the costs
>>> involved, but the basic theory with a master-to-N hub is not a great
>>> start: every change needs to be processed by all N replications. So if a
>>> user writes a document that ends up in the master database, every other
>>> user's filter function needs to process that change coming out of master.
>>> Even when the N users are generating 0 changes (instead of M each), it's
>>> not doing M*N work, but there are still always 2*N open connections and
>>> supporting processes providing a nasty baseline for large values of N.
>>
>> Looks like I was wrong about needing enough RAM for one couchjs process per replication.
>>
>> CouchDB maintains a pool of (no more than query_server_config/os_process_limit)
>> couchjs processes, and work is divvied out amongst these as necessary. I
>> found a little meta-discussion of this system at
>> https://issues.apache.org/jira/browse/COUCHDB-1375 and the code that uses
>> the pool is here:
>> https://github.com/apache/couchdb/blob/master/src/couchdb/couch_query_servers.erl#L299
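>>
>> For reference, that knob lives in local.ini under the same section (the
>> value here is just illustrative, not a recommendation):
>>
>>     [query_server_config]
>>     os_process_limit = 25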
>>
>> On my laptop, I was able to spin up 250 users without issue. Beyond that,
>> I start running into ± hardcoded system resource limits that Erlang has
>> under Mac OS X, but from what I've seen the only theoretical scalability
>> issue with going beyond that on Linux/Windows would be response times, as
>> the worker processes become more and more saturated.
>>
>> It still seems wise to implement tiered replications for communicating
>> between thousands of *active* user databases, but that seems reasonable to me.
>
> CouchDB’s design is obviously lacking here.
>
> For immediate relief, I'll propose the usual jackhammer of unpopular
> responses: write your filters in Erlang. (sorry :)
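>
> A minimal example, assuming the native Erlang query server is enabled; it
> goes in a design doc with "language": "erlang", and the "owner" field and
> value here are made up for illustration:
>
>     fun({Doc}, {Req}) ->
>         %% Doc is the document's fields as a proplist
>         case couch_util:get_value(<<"owner">>, Doc) of
>             <<"some-user">> -> true;  %% replicate only this user's docs
>             _ -> false
>         end
>     end.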
>
> For the future: we already see progress in improving the view server
> situation. Once we get to a more desirable setup (yaynode/v8), we can
> improve the view server communication; there is no reason you'd need a
> dedicated JS OS process per active replication, and we should absolutely
> fix that.
>
> --
>
> Another angle is the replicator. I know Jason Smith has a prototype of this
> in Node, and it works. Instead of maintaining N active replications, we just
> keep a maximum number of active connections and cycle out the ones that are
> currently inactive. The DbUpdateNotification mechanism should make this
> relatively straightforward. There is added overhead for setting up and
> tearing down replications, but we can make better use of resources and not
> clog things up with inactive replications. Especially in a db-per-user
> scenario, most replications don't see much of an update most of the time;
> they should be inactive until data is written to any of the source
> databases. The mechanics in CouchDB are all there for this, we just need to
> write it.
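>
> As a rough, untested sketch of that cycling (start_rep/1 and stop_rep/1 are
> stand-ins for starting/cancelling a replication from its _replicator doc,
> not real CouchDB calls):
>
>     -module(rep_cycling).
>     -export([db_updated/2]).
>
>     -record(st, {active = [], max = 200}).  % max is an arbitrary cap
>
>     %% React only to databases that actually changed, and cap the number
>     %% of concurrently running replications.
>     db_updated(DbName, #st{active = Active, max = Max} = St) ->
>         case lists:member(DbName, Active) of
>             true ->
>                 St;                               % already running
>             false when length(Active) < Max ->
>                 start_rep(DbName),
>                 St#st{active = [DbName | Active]};
>             false ->
>                 %% over the cap: park the least recently woken replication
>                 %% and give its slot to the db that just changed
>                 [Oldest | Rest] = lists:reverse(Active),
>                 stop_rep(Oldest),
>                 start_rep(DbName),
>                 St#st{active = [DbName | lists:reverse(Rest)]}
>         end.
>
>     start_rep(_DbName) -> ok.
>     stop_rep(_DbName) -> ok.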
>
> --
>
> Nate, thanks for sharing your findings and for bearing with us, despite your
> very understandable frustrations. It is people like you who allow us to make
> CouchDB better!
>
> Best
> Jan
> --
>
>
>
