incubator-couchdb-user mailing list archives

From Randall Leeds <randall.le...@gmail.com>
Subject Re: How fast do CouchDB propagate changes to other nodes?
Date Fri, 17 Dec 2010 23:07:19 GMT
How fast:
Asking "how fast" is almost meaningless on its own, since the answer
depends heavily on what sits between CouchDB and your chat clients.

After a change is written to the database, the internal change
listener gets the update almost immediately. From there it is either
pushed down an open _changes?feed=continuous connection to a consumer
(e.g. another Couch doing pull replication, or a client), or sent in
two HTTP requests (usually with keep-alive) to the destination CouchDB
of a push replication.

For *one* pull replication or _changes hop, the best case (the feed is
up to date and the consumer is already waiting for the next entry from
the server) costs only the time for the producer (CouchDB) and the
consumer (CouchDB, a client, etc.) to serialize, send, receive, and
deserialize one line of JSON text. Nothing more. This can be really
fast.
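
To make the "one line of JSON" point concrete, here is a minimal Python
sketch of the consumer side. It assumes the continuous feed delivers one
JSON object per line, blank lines as heartbeats, and a final last_seq
line when the feed closes. The function name iter_changes and the sample
rows are made up for illustration, and the canned list stands in for a
real HTTP response body.

```python
import json
from typing import Iterable, Iterator


def iter_changes(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one change row per non-empty line of a continuous _changes feed."""
    for line in lines:
        line = line.strip()
        if not line:
            # Blank lines are heartbeats that only keep the connection open.
            continue
        row = json.loads(line)
        if "last_seq" in row:
            # Assumed closing line, sent when the feed ends (e.g. on timeout).
            break
        yield row


# Canned lines standing in for the streamed HTTP response body.
sample = [
    '{"seq":1,"id":"msg-1","changes":[{"rev":"1-abc"}]}\n',
    "\n",
    '{"seq":2,"id":"msg-2","changes":[{"rev":"1-def"}]}\n',
    '{"last_seq":2}\n',
]
received = [row["id"] for row in iter_changes(sample)]
```

Per hop, the consumer's work is one json.loads call on one line, which
is why the best case is so cheap.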

Should you use CouchDB?

Let's assume this project gets interesting and you need multiple nodes
like you described. You could partition your clients between CouchDB
nodes using a consistent hash on a normalized name of the user to
divide up the resources of a cluster. You would then filter the
replications such that each Couch only receives messages intended for
its connected users.
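
A rough sketch of the partitioning idea in Python, assuming MD5 as the
hash and 64 virtual points per node (both arbitrary choices). The node
names and the HashRing class are hypothetical, not anything CouchDB
provides.

```python
import bisect
import hashlib


def normalize(username: str) -> str:
    """Normalize a user name so that spelling variants map to the same key."""
    return username.strip().lower()


def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring mapping user names to CouchDB node names."""

    def __init__(self, nodes, replicas=64):
        # Each node gets `replicas` points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._keys = [h for h, _ in self._ring]

    def node_for(self, username: str) -> str:
        h = _hash(normalize(username))
        # First ring point clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._keys, h) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["couch-a", "couch-b", "couch-c"])
home = ring.node_for("  Johnny ")
```

The node returned for a user is the one whose replication filter would
let that user's messages through. With a consistent hash, dropping one
node only moves the users that lived on it.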

The biggest hurdle here is checkpointing. Since replication needs to
know where to begin if it's restarted, you need to create a
replication topology or strategy that is both resilient to network
outages and doesn't require rescanning the entire chat history should
you need to change your replication pattern (in response to failure,
scaling, reconfiguration, etc).
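
The same restart problem shows up for a plain _changes consumer, where
it is easier to illustrate. The Python sketch below is not the
replicator's own checkpoint mechanism. It assumes a hypothetical client
that remembers the last sequence it processed in a local file and
resumes with the since parameter. The file layout, helper names, and
host are made up.

```python
import json
import os
import tempfile
from urllib.parse import urlencode


def load_checkpoint(path, default=0):
    """Return the last processed sequence, or `default` on first start."""
    try:
        with open(path) as f:
            return json.load(f)["since"]
    except FileNotFoundError:
        return default


def save_checkpoint(path, seq):
    """Persist the sequence via write-then-rename so a crash cannot truncate it."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"since": seq}, f)
    os.replace(tmp, path)


def changes_url(base, db, since):
    """Build the continuous _changes URL that resumes after `since`."""
    query = urlencode({"feed": "continuous", "since": since})
    return f"{base}/{db}/_changes?{query}"


# Hypothetical checkpoint file location and CouchDB address.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "inbox.checkpoint.json")
first_url = changes_url("http://localhost:5984", "inbox",
                        load_checkpoint(checkpoint_path))
```

If the checkpoint is lost or the topology changes, the consumer falls
back to the default and replays from the start, which is exactly the
cost a good topology should avoid.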

If I were doing it this way I would maybe keep an "inbox" and
"outbox" database on every node. You could even name outbox something
like "ramdisk/outbox" and mount a RAM disk as "ramdisk" in the CouchDB
storage directory so that "outbox.couch" gets stored in there. When
your clients send messages you could store them in the outbox and
trust that when they arrive at the right "inbox" on some CouchDB they
will be persisted there. You could even round robin through many
outboxes, or have one per hour or so. This keeps your storage down and
opens up interesting replication patterns for pushing messages
through a redundantly connected graph of Couches without building up a
massive database that will be hard to replicate (except the inboxes at
the edges).
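
A sketch of the outbox idea in Python, under several assumptions: one
outbox per hour named by a UTC timestamp suffix, a filter function
living in a design document called "chat" (the name "chat/for_node" is
made up), and a request body of the kind you would POST to _replicate.
Nothing here talks to a real server.

```python
from datetime import datetime, timezone


def outbox_name(now: datetime, prefix: str = "ramdisk/outbox") -> str:
    """Name the outbox for the hour containing `now`, e.g. one database per hour."""
    return f"{prefix}-{now.strftime('%Y%m%d%H')}"


def replication_request(source_db: str, target_url: str, node_name: str) -> dict:
    """Build a filtered, continuous push replication body for POST /_replicate."""
    return {
        "source": source_db,
        "target": target_url,
        "continuous": True,
        # Hypothetical filter that passes only messages for users on `node_name`.
        "filter": "chat/for_node",
        "query_params": {"node": node_name},
    }


now = datetime(2010, 12, 17, 23, 7, tzinfo=timezone.utc)
body = replication_request(
    outbox_name(now),
    "http://couch-b:5984/inbox",  # hypothetical destination inbox
    "couch-b",
)
```

Old hourly outboxes can simply be deleted once their contents have
reached an inbox, which is what keeps storage on the RAM disk small.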

Using CouchDB for a chat server is an interesting idea, but I don't
know of anyone using CouchDB for replication that is this 'gossipy'. I
think BigCouch might do some every-to-every node replication for
keeping cluster information and database metadata up to date around
the cluster, but that information tends to be small and changes
infrequently.

However, to me this sounds like a lot of work for something that might
be better solved using technologies like ZeroMQ, particularly if
logging all messages is optional.

Anyway, I'm happy to talk about all of this further since I think it's
kind of fascinating. I've been thinking a lot recently about how flood
replication could function efficiently in a dynamic environment, but
it's mostly open questions right now.

I hope that provides some direction and thought guidance. Please let
me know if anything didn't make sense or you have other interesting
ideas or questions. I think it could be made to work, but it's not a
natural fit at scale for the existing replication model at this time.

Cheers,
Randall

On Fri, Dec 17, 2010 at 13:57, Johnny Weng Luu
<johnny.weng.luu@gmail.com> wrote:
> Hi
>
> Im designing a chat app and i thought about this design:
>
> Clients are connected to the nearest couchdb and listening for changes (chat
> texts).
> If one client posts a new message it will be inserted in that client's
> couchdb node.
> The change will be propagated to other couchdb nodes in the cluster.
> The clients connected to those couchdb nodes will get that message.
>
> But this design is heavily dependent on how fast couchdb propagates changes
> to other nodes.
> Is this a good design with couchdb or is it not intended for this design?
>
> How else could you design a chat application with couchdb?
>
> /Johnny
>
