couchdb-dev mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: reiterating transactions vs. replication
Date Fri, 22 May 2009 21:30:24 GMT
On Thu, May 21, 2009 at 8:30 PM, Yuval Kogman <nothingmuch@woobling.org> wrote:
> 2009/5/21 Adam Kocoloski <kocolosk@apache.org>:
>> Hi Yuval, thanks for this well-written proposal.  I don't really want to
>> rehash all the discussion from back in February (see the thread beginning at
>> http://mail-archives.apache.org/mod_mbox/couchdb-dev/200902.mbox/%3c84F66023-030A-4669-B75C-3DCC92D71A78@yahoo.com%3e for
>> a particularly detailed discussion), but I do want to comment on one aspect.
>>
>> Updating the replicator to be smart about atomic bulk transactions is doable
>> (although a major undertaking), but when you throw DB compaction and
>> revision stemming into the mix things get really hairy.  Recall that CouchDB
>> revisions are used for concurrency control, not for maintaining history.
>>  Consider the following sequence of events:
>>
>> 1) Generate foo/1 and bar/1 in an atomic _bulk_docs operation
>> 2) Update foo -> foo/2
>> 3) Compact the DB (foo/1 is deleted)
>> 4) Start replicating to a mirror
>> 5) Replication crashes before it reaches foo/2
>
> By crash you mean an error due to a conflict between foo/2 and foo/1'
> (the mirror's version of foo), right?
>
>> In your proposal, we should expect foo/1 to exist on the mirror, right?  I
>> think this means we'd need to modify the compaction algorithm to keep
>> revisions of documents if a) the revision was part of an atomic _bulk_docs,
>> and b) any of the documents in that transaction are still at the revision
>> generated by the transaction.  Same thing goes for revision stemming -- we
>> can never drop revisions if they were part of an atomic upload and at least
>> one of the document revs in the upload is still current.
>
> Yep. Personally I see this is a tradeoff, not a limitation per se. If
> you specify 'atomic' then you must pay more in terms of data size,
> performance, etc.
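>
> To make the cost concrete, the compactor's keep-rule would become
> something like this (a hypothetical sketch, not actual CouchDB code):
>
>     # txns: rev -> {doc_id: rev written by the same atomic _bulk_docs}
>     # current: doc_id -> current rev of that doc
>     def must_keep(rev, txns, current):
>         group = txns.get(rev)
>         if group is None:
>             return False  # not atomic; normal compaction rules apply
>         # keep while any doc in the group is still at its transaction rev
>         return any(current.get(doc) == r for doc, r in group.items())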

The problem as I see it is that someone else's bulk transaction will
have to sit around in my database until I've edited all the docs in it.
And I'd have to hope I never get a distributed conflict on an old
version of a doc in the group, because that would somehow make the
edits I've done locally to other documents in the bulk group less
valid.

Distributed bulk transactions would make for chaotic behavior, as
someone's mostly unrelated change on a remote node could eventually
replicate to me (months later) and knock an entire line of work that
I've done into a conflict state.

If you want atomicity, put it in a single document.
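For example (a Python sketch; the database URL and document fields are
made up, but the _rev/409 behavior is CouchDB's):

    import requests

    db = "http://127.0.0.1:5984/mydb"  # hypothetical database

    # Keep the whole unit of consistency in one document, so a single
    # PUT is atomic under CouchDB's per-document MVCC.
    doc = requests.get(db + "/account-42").json()
    doc["balance"] -= 10
    doc["history"].append({"debit": 10})
    # 201 only if doc["_rev"] is still current; a concurrent writer
    # turns this into a 409, in which case refetch and retry.
    resp = requests.put(db + "/account-42", json=doc)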

Chris

>
> In 0.8 you would have theoretically had to pay by default, but didn't
> because replication broke transactions.
>
> The basic algorithm is still the same, but the garbage-collected unit
> changes: instead of collecting individual document revisions, it
> collects revision sets (the current case being a set with one member).
> The same rule still applies: if an object is wholly shadowed by
> non-conflicting changes, it can be disposed of. IIRC the algorithm is
> a copying garbage collector, so this is pretty easy to do (you walk a
> DAG instead of a linked list).
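>
> As a sketch (hypothetical code, just to show the shape of the walk):
>
>     def live_sets(roots, parents):
>         """Copying-GC sketch: walk the revision DAG from the current
>         revisions; revision sets never reached -- i.e. wholly shadowed
>         by non-conflicting changes -- are left behind in the old file."""
>         seen, stack = set(), list(roots)
>         while stack:
>             revset = stack.pop()
>             if revset in seen:
>                 continue
>             seen.add(revset)
>             stack.extend(parents.get(revset, ()))  # a DAG, not a linked list
>         return seen  # the revision sets to copy into the new file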
>
> Under the proposed model you'd choose which operations are
> transactional and would pay only for those.
>
>
> Anyway, thanks for your link as well. I was reading through a rather
> boring thread and didn't see this one, so I guess I did miss out. It
> seemed to imply the discussion happened only on IRC.
>
> Anyway, here goes...
>
> The fundamental problem is that any consistent data model needs, at
> the very least, atomic primitives and ordered message passing (with
> transactional message handlers) at the per-partition level; otherwise
> atomicity and consistency are restricted to a single document.
>
> What concerns me is Damien's post
> (http://mail-archives.apache.org/mod_mbox/couchdb-dev/200902.mbox/%3c451872B8-152C-42A6-9324-DD52534D9A32@apache.org%3e):
>
>> No, CouchDB replication doesn't support replicating the transactions.
>> Never has, never will. That's more like transaction log replication
>> that's in traditional dbs, a different beast.
>>
>> For the new bulk transaction model, I'm only proposing supporting
>> eventual consistency. All changes are safe to disk, but the db may not
>> be in a consistent state right away.
>
> From what I know this assumption is wrong. Eventual consistency still
> needs atomic primitives; it's not about whether or not you have
> transactions, but about what data they affect (eventual consistency
> involves breaking them down).
>
> Anyway, "never will" sounds pretty binding, but for the sake of argument:
>
> By using only insertions and idempotent updates for the bulk of the
> data changes, plus a message queue whose handlers use atomic updates
> to integrate that data, one can implement a truly atomic distributed
> model, or eventual consistency; without this, updates need to be
> restricted to exactly one document.
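>
> The insert side might look like this (hypothetical helper; the point
> is that content-hash IDs make retries and replays harmless):
>
>     import hashlib, json, requests
>
>     db = "http://127.0.0.1:5984/partition-a"  # hypothetical
>
>     def put_immutable(data):
>         """Idempotent insert: the ID is a hash of the content, so a
>         retry or replay is a no-op (a 409 just means it's already there)."""
>         doc_id = hashlib.sha1(
>             json.dumps(data, sort_keys=True).encode()).hexdigest()
>         requests.put("%s/%s" % (db, doc_id), json=data)
>         return doc_id
>
> The queue handlers then only ever perform single-document atomic
> updates, retried on conflict (see the add-friend sketch below).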
>
> Eventual consistency is still possible, either using locks or by
> breaking down what would have been large distributed transactions
> into smaller ones, but the key is that the code that makes things
> actually consistent must still have ACID guarantees (and be
> dispatched in order).
>
> The 0.9 CouchDB model is effectively MyISAM without data loss, but
> just because the data is around doesn't mean it's possible to know
> what to do with it (loss of context), or even fix it safely (the
> conflict resolution code is susceptible to conflicts too).
>
> Unfortunately, for eventual consistency to actually work, the
> breaking down of operations must be done at the application level;
> the database can't decide which data can be deferred and which
> cannot.
>
> All immutable data and all new data can obviously be added to the
> database outside of a transaction, but eventually the update that
> links this data together must be an atomic mutation.
>
> The only way to support this without atomic operations on a unit
> larger than a document is to have a "master" document for every
> transitive closure of the graph structure requiring consistency,
> which in effect only really works when the referenced documents are
> immutable snapshots (e.g. where the ID is a hash of the data). If
> these closures overlap, then a single "master" for the whole graph
> will be needed.
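>
> i.e. a layout something like this (made-up document, just to
> illustrate the shape):
>
>     # the only mutable -- and thus contended -- document in the closure;
>     # everything it references is an immutable, content-addressed snapshot
>     master = {
>         "_id": "graph-master",
>         "nodes": ["sha1-ab12", "sha1-cd34"],  # hypothetical hash IDs
>         "edges": ["sha1-ef56"],
>     }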
>
>
> To illustrate, let's make up a social networking example. Let's say
> you are adding a friend on this social network, and that this
> operation involves 3 updates: one to add a link from your profile to
> your friend's ID, another for the inverse, and a third to send a
> "hello" message to the friend, updating their inbox.
> The first update lives in one partition, and the second and third
> updates are on a second one.
>
> The back pointers in your new friend's profile must be updated. In a
> fully transactional model this would lock the friend's document and
> yours at the same time; in an eventual-consistency model it would
> queue a message for the friend's partition, and a message handler on
> the friend's partition would update it atomically, "eventually". It's
> fine for the link to be out of date for a while, but eventually it
> needs to be fixed (e.g. if you want to remove the friend, message
> them, etc).
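>
> In code, the eventual-consistency version might look like this (a
> hypothetical sketch; helper and field names are made up):
>
>     import requests
>
>     def update(db, doc_id, fn):
>         """Single-document atomic update, retried on 409 conflicts."""
>         while True:
>             doc = requests.get("%s/%s" % (db, doc_id)).json()
>             fn(doc)
>             resp = requests.put("%s/%s" % (db, doc_id), json=doc)
>             if resp.status_code == 201:
>                 return
>             # 409: someone else won the race; refetch and retry
>
>     def add_friend(my_db, friend_db, me, friend):
>         # update 1: forward link, atomic on my partition
>         update(my_db, me, lambda d: d["friends"].append(friend))
>         # updates 2 and 3: queued as a plain message document; a handler
>         # on the friend's partition applies the back pointer and the
>         # inbox "hello" atomically, "eventually"
>         requests.post(friend_db, json={
>             "type": "queued-message", "action": "friend-added",
>             "from": me, "to": friend,
>         })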
>
> In CouchDB 0.9 one of the writes will get a "conflict" error back, and
> the client could refetch the updated version and try the edit again.
> The problem is that if they wrote the third update to another document
> on the same node making assumptions about the same data, that write
> may have succeeded, leaving the data inconsistent. Under an
> eventual-consistency model you still use transactions to do these
> updates; you just have to design your model to break them down into
> smaller units.
>
> The reason a graph structure is more susceptible to inconsistency is
> that while in a relational model many data linkage operations can be
> done with a single insert/update (e.g. `insert into edges (node1_id,
> node2_id)`), in a document based database this type of opreation
> involves modifying all the affected documents. The chance of
> inconsistency is increased because contention is higher and there is
> more data that must be synchronized.
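>
> Compare, reusing the update() helper sketched above (hypothetical):
>
>     # relational: one atomic `insert into edges ...` links two nodes;
>     # here two documents must change, so two chances to conflict
>     update(db, "node-1", lambda d: d["edges"].append("node-2"))
>     update(db, "node-2", lambda d: d["edges"].append("node-1"))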
>
> However, in another post Damien said:
>
>> Which is why in general you want to avoid inter-document dependencies,
>> or be relaxed in how you deal with them.
>
> So I'd best shut up after this unless some decision maker tells me
> not to; if my use case is not covered by the intended design then
> that's that. But I do think this thread sort of covers it:
>
>> As far as distributed transactions go, I'd be thrilled if we could
>> implement it and also support the rest of couchdb, like views and bi-
>> directional replication. Please start up a discussion here in dev@
>> about it and see if you can work out a design.
>
> Without going too pie-in-the-sky.
>
> Cheers,
> Yuval
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io
