couchdb-user mailing list archives

From "Markus Jelsma" <>
Subject Re: Several newbie CouchDB questions.
Date Wed, 24 Feb 2010 20:10:38 GMT
Actually, on some level it does deal with node failure and cluster changes.

Failures are handled gracefully. Once you have a decent sharded cluster
installed, you can actually shut nodes down (as if they had failed) and
keep it running. I have a test setup with 4 virtual machines, each
running dumbproxy and smartproxy. It has been sharded at a level that
allows me to shut half the cluster down while pulling data from it using
Siege or ApacheBench; everything just goes a bit slower. The only thing
you need to keep in mind is that the node you use for access (in my case
all 4 grant access) must be up; but that can be remedied.

The only thing that can fail during reads is a view that needs to
aggregate data from the nodes. The total resultset can be smaller than
anticipated if a node fails during that process. The final resultset won't
be corrupted though.
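
As a rough illustration of that behaviour, here is a minimal Python sketch,
not Lounge's actual code. It assumes each shard is queried through a
hypothetical fetch function that returns a list of view rows and raises an
OSError (for example ConnectionError) when its node is down. Rows from the
shards that answered are merged; the failed shards are reported so the
caller knows the result is smaller than the full one, but still valid.

```python
from typing import Callable, Dict, List, Tuple


def merge_view_rows(
    fetchers: Dict[str, Callable[[], List[dict]]]
) -> Tuple[List[dict], List[str]]:
    """Merge view rows from all shards, skipping shards that cannot be reached.

    `fetchers` maps a shard name to a hypothetical zero-argument function
    that returns that shard's rows or raises OSError if the node is down.
    """
    rows: List[dict] = []
    failed: List[str] = []
    for shard, fetch in fetchers.items():
        try:
            rows.extend(fetch())
        except OSError:
            # The node failed mid-query: its rows are simply missing.
            failed.append(shard)
    # Restore key order across shards, as a merged view result would have.
    rows.sort(key=lambda row: row["key"])
    return rows, failed
```

A non-empty second return value signals a partial result; the rows that did
arrive are untouched.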

Pushing data to the cluster while, for instance, one node is down is a bit
more complicated because you really need to replicate the changes made to
the sharded databases back to the dead node. This must be done manually
before it joins the cluster again. Anyway, it would be a nice feature if
the cluster could repopulate a dead node automatically when it comes up
again.
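
The manual catch-up could be scripted along these lines. This is only a
sketch: it builds one request per shard database for CouchDB's standard
POST /_replicate endpoint, which takes a JSON body with "source" and
"target". The node URLs and shard database names are made up for
illustration, and it assumes a live node holds an up-to-date copy of every
shard that the recovered node should have.

```python
import json
from typing import List, Tuple


def catch_up_jobs(
    live_node: str, recovered_node: str, shard_dbs: List[str]
) -> List[Tuple[str, str]]:
    """Return (url, json_body) pairs, one replication request per shard database.

    Each body asks `live_node` to replicate one shard database to the
    same-named database on `recovered_node`.
    """
    jobs = []
    for db in shard_dbs:
        body = json.dumps(
            {
                "source": f"{live_node}/{db}",
                "target": f"{recovered_node}/{db}",
            }
        )
        jobs.append((f"{live_node}/_replicate", body))
    return jobs
```

Each body would be sent as a POST with Content-Type application/json, and
the recovered node would only be put back into the shard map once all of
the replications have finished.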

Dealing with cluster changes is a challenge. Adding more nodes to the
cluster is quite easy, but reducing it is very complicated because the
data has already been sharded. At this moment, you would need to pull the
data from the cluster, reconfigure the shardmap to fit a reduced cluster,
and populate it again. But beware, growing the cluster will be a tough job
if you haven't given it enough thought up front. By oversharding the
cluster, growth can be accommodated easily - it's just a matter of
pointing shards to another node and copying those sharded databases to the
new node. Well, it isn't that easy but shouldn't give you a headache.
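
To make the oversharding idea concrete, here is a small Python sketch under
assumptions of my own: 16 shards, CRC32 of the document id to pick the
shard, and a plain dict as the shard map. None of this is taken from
Lounge's configuration format. The point is that the id-to-shard mapping
never changes; growing the cluster only changes which node a shard points
to, and the moved shards are the databases that need copying.

```python
import zlib
from typing import Dict, List

NUM_SHARDS = 16  # assumption: fixed up front, deliberately more than the node count


def shard_for(doc_id: str) -> int:
    """Map a document id to a shard number; unaffected by adding nodes."""
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS


def initial_map(nodes: List[str]) -> Dict[int, str]:
    """Spread the shards round-robin over the starting nodes."""
    return {shard: nodes[shard % len(nodes)] for shard in range(NUM_SHARDS)}


def add_node(shard_map: Dict[int, str], new_node: str) -> List[int]:
    """Point a fair share of shards at new_node and return the shards that moved.

    The returned shards are the databases that would have to be copied
    to the new node before the updated map is put into use.
    """
    node_count = len(set(shard_map.values()) | {new_node})
    fair_share = NUM_SHARDS // node_count
    moved: List[int] = []
    while len(moved) < fair_share:
        # Group shards by their current owner, ignoring the new node.
        owned: Dict[str, List[int]] = {}
        for shard, node in shard_map.items():
            if node != new_node:
                owned.setdefault(node, []).append(shard)
        # Take one shard from whichever node currently holds the most.
        donor = max(owned, key=lambda node: (len(owned[node]), node))
        shard = max(owned[donor])
        shard_map[shard] = new_node
        moved.append(shard)
    return moved
```

Starting from 4 nodes with 4 shards each, adding a fifth node moves 3
shards and leaves the other 13 where they were. With only 4 shards on 4
nodes there would be nothing left to hand over without resharding.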

Although we aren't using it in production yet, we will someday. Perhaps
the Lounge developers and production users can say something about their
experience and feature requests.

Time Less said:
> I've looked over what little there is, and it appears to me Lounge
> doesn't deal with node failure or cluster size changes (ie:
> adding/subtracting nodes in the cluster). It looks like it's merely two
> components for distributing reads/writes and giving some map/reduce
> functionality.
> --
> timeless(ness)
