couchdb-dev mailing list archives

From Randall Leeds <>
Subject Re: reiterating transactions vs. replication
Date Fri, 22 May 2009 18:30:29 GMT
On Fri, May 22, 2009 at 12:27, Randall Leeds <> wrote:
> Since I do like the component model, I'm planning to set up a github
> project to play with some consensus protocols and overlay networks in
> Erlang. Hopefully once I start doing that I'll start to see the places that
> CouchDB can hook into it and get a nice, clean, flexible API. I see the
> problem broken into several tiers.
> Transactional Bulk Docs (this is the wishlist and challenge, but has to
> rest on the below)
> Sharding/Replication (_seq consensus / possibly consistent hashing or other
> distributed, deterministic data structure mapping BTree nodes to servers
> [2])
> Communication (either Erlang or a tcp with pluggable overlay-network for
> routing)

A revised break-down should be something like:

Transactional Bulk-Docs
Single-Doc Multi-Replica Transactions
Replication / Sharding


Transactional Bulk-Docs (Server pre-prepares itself as leader for a special
bulk round)
Single-Doc Multi-Replica Transactions (Simple consensus. Special leader for
bulk case. Pre-determined leader normally.)
Replication / Sharding (Any sort of load-balancing, slicing, or static
Network (Chord and derivatives (Scalaris uses Chord #), Tapestry, Pastry,

I think with the right configurations and components, transactional bulk-docs
are just a special case of single-doc transactions. For example, if the
single-doc layer optimizes for fewer communication rounds by pre-selecting
leaders on a rotating basis, a bulk transaction just involves revoking all
nodes for a sequence number consensus round and using an extra round trip to
"take over" the leader position. Then all nodes holding replicas of all
documents involved would have to participate in this new round (or at least
a majority of replicas). Setting 'atomic=false' could skip this expense,
making a best-effort serial execution of the updates and failing on conflict.
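
To make the idea concrete, here is a rough sketch in Python (not Erlang, and
not tied to any existing CouchDB code). Everything in it is an assumption for
illustration: the Replica class, rotating_leader, current_rev and bulk_docs
are made-up names, revisions are plain integers, and the "consensus round" is
reduced to counting which replicas are reachable. It only shows the shape of
the atomic path (take over the round, require a majority per document, then
apply all or nothing) next to the atomic=false path (apply serially, stop at
the first conflict).

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica node, not a CouchDB type."""
    name: str
    up: bool = True
    docs: dict = field(default_factory=dict)     # doc_id -> (rev, body)
    leaders: dict = field(default_factory=dict)  # seq -> leader name

    def grant_takeover(self, seq, candidate):
        # The extra round trip: replace the pre-selected leader for this
        # round with the candidate. An unreachable node cannot vote.
        if self.up:
            self.leaders[seq] = candidate
        return self.up


def rotating_leader(seq, nodes):
    # Assumed pre-selection rule for the normal single-doc case.
    return nodes[seq % len(nodes)].name


def current_rev(replicas, doc_id):
    # Highest revision on any reachable replica; 0 if the doc is new.
    return max((r.docs.get(doc_id, (0, None))[0] for r in replicas if r.up),
               default=0)


def bulk_docs(updates, placement, seq, leader, atomic=True):
    """updates: list of (doc_id, expected_rev, body).
    placement: doc_id -> list of Replica objects holding that doc."""
    if atomic:
        # Every document needs a majority of its replicas in the round.
        for doc_id, _, _ in updates:
            replicas = placement[doc_id]
            votes = sum(r.grant_takeover(seq, leader) for r in replicas)
            if votes <= len(replicas) // 2:
                return {"ok": False, "reason": "no_majority",
                        "doc": doc_id, "applied": []}
        # Check all revisions before writing, so the batch is all-or-nothing.
        for doc_id, expected_rev, _ in updates:
            if current_rev(placement[doc_id], doc_id) != expected_rev:
                return {"ok": False, "reason": "conflict",
                        "doc": doc_id, "applied": []}
    applied = []
    for doc_id, expected_rev, body in updates:
        # With atomic=False this is the only check: best effort, in order.
        if current_rev(placement[doc_id], doc_id) != expected_rev:
            return {"ok": False, "reason": "conflict",
                    "doc": doc_id, "applied": applied}
        for r in placement[doc_id]:
            if r.up:
                r.docs[doc_id] = (expected_rev + 1, body)
        applied.append(doc_id)
    return {"ok": True, "applied": applied}
```

With three replicas holding docs "x" and "y", an atomic call whose second
update carries a stale revision leaves both docs untouched, while the same
call with atomic=False applies "x" and then reports the conflict on "y". A
real implementation would also have to release the takeover on failure and
deal with concurrent rounds, which this sketch ignores.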

Just trying to keep the conversation rolling. But I understand we have to
hit the code soon if this really stands to go somewhere.
