couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nathan Stott <nrst...@gmail.com>
Subject Re: reiterating transactions vs. replication
Date Fri, 22 May 2009 21:45:44 GMT
As a user, when I chose couchdb for my most recent project, I chose it
because I didn't care about transactions.  I would've used RDBMS if that
were important.
I chose it because couch solved the problems I needed solved very well.

I don't think transactions should be a big dev focus.

On Fri, May 22, 2009 at 4:30 PM, Chris Anderson <jchris@apache.org> wrote:

> On Thu, May 21, 2009 at 8:30 PM, Yuval Kogman <nothingmuch@woobling.org>
> wrote:
> > 2009/5/21 Adam Kocoloski <kocolosk@apache.org>:
> >> Hi Yuval, thanks for this well-written proposal.  I don't really want to
> >> rehash all the discussion from back in February (see the thread
> beginning at
> >>
> http://mail-archives.apache.org/mod_mbox/couchdb-dev/200902.mbox/%3c84F66023-030A-4669-B75C-3DCC92D71A78@yahoo.com%3e
>  for
> >> a particularly detailed discussion), but I do want to comment on one
> aspect.
> >>
> >> Updating the replicator to be smart about atomic bulk transactions is
> doable
> >> (although a major undertaking), but when you throw DB compaction and
> >> revision stemming into the mix things get really hairy.  Recall that
> CouchDB
> >> revisions are used for concurrency control, not for maintaining history.
> >>  Consider the following sequence of events:
> >>
> >> 1) Generate foo/1 and bar/1 in an atomic _bulk_docs operation
> >> 2) Update foo -> foo/2
> >> Compact the DB (foo/1 is deleted)
> >> Start replicating to a mirror
> >> Replication crashes before it reaches foo/2
> >
> > By crash you mean an error due to a conflict between foo/2 and foo/1'
> > (the mirror's version of foo), right?
> >
> >> In your proposal, we should expect foo/1 to exist on the mirror, right?
>  I
> >> think this means we'd need to modify the compaction algorithm to keep
> >> revisions of documents if a) the revision was part of an atomic
> _bulk_docs,
> >> and b) any of the documents in that transaction are still at the
> revision
> >> generated by the transaction.  Same thing goes for revision stemming --
> we
> >> can never drop revisions if they were part of an atomic upload and at
> least
> >> one of the document revs in the upload is still current.
> >
> > Yep. Personally I see this is a tradeoff, not a limitation per se. If
> > you specify 'atomic' then you must pay more in terms of data size,
> > performance, etc.
>
> The problem as I see it is that someone else's bulk transaction will
> have to sit around in my database, until I edit all the docs in it.
> Hopefully I won't get any distributed conflicts on other old versions
> of docs in the group because this would put edits that I've done
> locally to other documents in the bulk group, somehow less valid.
>
> Distributed bulk transactions would make for chaotic behavior, as
> someone's mostly unrelated change on a remote node could eventually
> replicate to me (months later) and knock an entire line of work that
> I've done into a conflict state.
>
> If you want atomicity, put it in a single document.
>
> Chris
>
> >
> > In 0.8 you would have theoretically had to pay by default, but didn't
> > because replication broke transactions.
> >
> > The basic algorithm is still the same, but the garbage collected unit
> > is changed (instead of garbage collecting document revisions it
> > garbage collects revision sets, with the current case being a set with
> > one member. The rules still apply (if this object is wholly shadowed
> > by non conflicting changes then it can be disposed of)). IIRC the
> > algorithm is a copying garbage collector, so this is pretty easy to do
> > (you walk a DAG instead of a linked list).
> >
> > Under the proposed model you'd choose which operations are
> > transactional and will have to pay for those.
> >
> >
> > Anwyay, thanks for your link as well, I was reading through a rather
> > boring thread and didn't see this one, so I guess I did miss out. It
> > seemed to imply the discussion was done only on IRC.
> >
> > Anyway, here goes...
> >
> > The fundamental problem is that any consistent data model needs at the
> > very least to have atomic primitives and ordered message passing (with
> > transactional message handlers) at the per-partition level, or
> > atomicity and consistency is restricted to a single document.
> >
> > What concerns me is Damien's post
> > (
> http://mail-archives.apache.org/mod_mbox/couchdb-dev/200902.mbox/%3c451872B8-152C-42A6-9324-DD52534D9A32@apache.org%3e
> ):
> >
> >> No, CouchDB replication doesn't support replicating the transactions.
> >> Never has, never will. That's more like transaction log replication
> >> that's in traditonal dbs, a different beast.
> >>
> >> For the new bulk transaction model, I'm only proposing supporting
> >> eventual consistency. All changes are safe to disk, but the db may not
> >> be in a consistent state right away.
> >
> > From what I know this assumption is wrong. Eventual consistency still
> > needs atomic primitives, it's not about whether or not you have
> > transactions, it's about what data they affect (eventual consistency
> > involves breaking them down).
> >
> > Anyway, "never will" sounds pretty binding, but for the sake of argument:
> >
> > By using only insertions and idempotent updates for the bulk of the
> > data changes and a message queue whose handlers use atomic updates to
> > integrate this data one can implement a truly atomic distributed
> > model, or an eventual consistency, but without this updates need to be
> > restricted to exactly one document.
> >
> > Eventual consistency is still possible using either locks or by
> > breaking down what would have been large distributed transactions into
> > smaller ones, but the key is that the code that will make things
> > actually consistent must still have ACID guarantees (and be dispatched
> > in order).
> >
> > The 0.9 model CouchDB is effectively MyISAM without data loss, but
> > just because the data is around doesn't mean it's possible to know
> > what to do with it (loss of context), or even fix it safely (the
> > conflict resolution code is susceptible to conflicts too).
> >
> > Unfortunately for eventual consistency to actually work the breaking
> > down of operations must be done on application level, the database
> > can't decide which data can be deferred and which data cannot.
> >
> > All immutable data and all new data can obviously be added to the
> > database outside of a transaction, but eventually a transaction
> > linking this data must be part of an atomic mutation.
> >
> > The only way to support this without atomic operations on a unit
> > larger than a document, is to have a "master" document for every
> > transitive closure the graph structure requiring consistency, which in
> > effect only actually relates to immutable snapshot documents (e.g.
> > where the ID is a hash of the data). If these closures overlap then a
> > single "master" for the whole graph will be needed.
> >
> >
> > To illustrate, let's make up a social networking example. Let's say
> > you are adding a friend on this social network, and that this
> > operation involves 3 updates, one to add a link from your profile to
> > your friend's ID, another for the inverse, and a third update to
> > update to send a "hello" message to the friend, updating their inbox.
> > The first update lives in one partition, and the second and third
> > updates are on a second one.
> >
> > The back pointers in your new friends must be updated. In an fully
> > transactional model this would lock the friend's document and yours at
> > the same time, in an eventual consistency model this would queue a
> > message for the friend's partition, and a message handler on the
> > friend's partition would update this atomically "eventually". It's
> > fine for the link to be out of date for a while, but eventually it
> > needs to be fixed (e.g. if you want to remove the friend, message
> > them, etc).
> >
> > In couchdb 0.9 one of the writes will get a "conflict" error back, and
> > they could refetch the updated version and try the edit again. The
> > problem is that if the wrote the third update update to another
> > document on the same node making assumptions about the same data, that
> > write may have succeeded, leaving the data inconsistent. Under an
> > eventual consistency model you still use transactions to do these
> > updates, you just must design your model to break them down into
> > smaller units.
> >
> > The reason a graph structure is more susceptible to inconsistency is
> > that while in a relational model many data linkage operations can be
> > done with a single insert/update (e.g. `insert into edges (node1_id,
> > node2_id)`), in a document based database this type of opreation
> > involves modifying all the affected documents. The chance of
> > inconsistency is increased because contention is higher and there is
> > more data that must be synchronized.
> >
> > However, in another post Damien said:
> >
> >> Which is why in general you want to avoid inter-document dependencies,
> >> or be relaxed in how you deal with them.
> >
> > So I think I best shut up after this without some decision maker
> > telling me not to, if my use case is not covered by the intended
> > design then that's that, but I do think this thread sort of covers
> > this:
> >
> >> As far as distributed transactions go, I'd be thrilled if we could
> >> implement it and also support the rest of couchdb, like views and bi-
> >> directional replication. Please start up a discussion here in dev@
> >> about it and see if you can work out a design.
> >
> > Without going too pie-in-the-sky.
> >
> > Cheers,
> > Yuval
> >
>
>
>
> --
> Chris Anderson
> http://jchrisa.net
> http://couch.io
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message