couchdb-dev mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: [DISCUSS] : things we need to solve/decide : changes feed
Date Tue, 19 Mar 2019 02:47:53 GMT

> On Mar 18, 2019, at 9:03 PM, Alex Miller <alexmiller@apple.com.INVALID> wrote:
> 
> 
>> On Mar 5, 2019, at 4:04 PM, Adam Kocoloski <kocolosk@apache.org> wrote:
>> With the incarnation and branch count in place we’d be looking at a design where the KV
>> pairs have the structure
>> 
>> (“changes”, Incarnation, Versionstamp) = (ValFormat, DocID, RevFormat, RevPosition,
>> RevHash, BranchCount)
>> 
>> where ValFormat is an enumeration enabling schema evolution of the value format in the
>> future, and RevFormat, RevPosition, RevHash are associated with the winning edit branch
>> for the document (not necessarily the edit that occurred at this version, matching current
>> CouchDB behavior) and carry the meanings defined in the revision storage RFC[2].
> 
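As a rough illustration of the quoted layout, here is a small Python sketch that models the key and value as plain tuples and sorts them the way an ordered key-value store would. It is only a toy under stated assumptions: the class names, the `make_entry` helper, the sample documents and the format constants are all hypothetical, and the versionstamp is built by hand here, whereas FoundationDB assigns it at commit time.

```python
from typing import NamedTuple


class ChangeKey(NamedTuple):
    subspace: str         # always "changes"
    incarnation: int
    versionstamp: bytes   # stand-in for the 10-byte versionstamp


class ChangeValue(NamedTuple):
    val_format: int       # enumeration allowing the value schema to evolve
    doc_id: str
    rev_format: int
    rev_position: int
    rev_hash: bytes
    branch_count: int


def make_entry(incarnation, commit_version, doc_id, rev_position, rev_hash, branch_count):
    # Hypothetical helper: 8 bytes of commit version plus 2 bytes of batch order.
    # In a real cluster the database fills this in; the client cannot choose it.
    versionstamp = commit_version.to_bytes(8, "big") + (0).to_bytes(2, "big")
    key = ChangeKey("changes", incarnation, versionstamp)
    value = ChangeValue(1, doc_id, 1, rev_position, rev_hash, branch_count)
    return key, value


entries = dict([
    make_entry(1, 205, "doc-b", 1, b"\xbb", 1),
    make_entry(1, 100, "doc-a", 1, b"\xaa", 1),
    make_entry(2, 50, "doc-c", 3, b"\xcc", 2),
])

# Reading the subspace in key order yields the feed: incarnation first,
# then versionstamp within an incarnation.
feed = [entries[key].doc_id for key in sorted(entries)]
```

Because the incarnation sorts ahead of the versionstamp, every entry from incarnation 2 follows every entry from incarnation 1, whatever its commit version.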
> 
> 
> Do note that with versionstamped keys, and atomic operations in general, it’s important
> to keep in mind that committing a transaction might return `commit_unknown_result`.
> Transaction loops will retry a `commit_unknown_result` error by default.  (Or, will, if
> your Erlang/Elixir bindings copy the behavior of the rest of the bindings.)  So you’ll need
> some way of making an insert into `changes` an idempotent operation.
> 
> 
> I’ll volunteer three possible options:
> 
> 1. The easiest case is if you happen to be inserting a known, fixed key (and preferably
> one that contains a versionstamped value) in the same transaction as a versionstamped key,
> as then you have a key to check in your database to tell if your commit happened or not.
> 
> 2. If you’re doing an insert of just this key in a transaction, and your key space has
> relatively infrequent writes, then you might be able to get away with remembering the
> initial read version of your transaction, and issue a range scan from (“changes”,
> Incarnation, InitialReadVersion) -> (“changes”, infinity, infinity), and filter through
> looking for a value equal to what you tried to write.
> 
> 3. Accept that you might write duplicate values at different versionstamped keys, and
> write your client code such that it will skip repeated values that it has already seen.
> 
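For illustration, the first option can be sketched in Python against an in-memory stand-in for the database. Nothing here is the FoundationDB API: `ToyStore`, `CommitUnknownResult` and `idempotent_insert` are hypothetical names, and the store simply simulates a commit that lands while its acknowledgement is lost.

```python
import itertools


class CommitUnknownResult(Exception):
    """Stand-in for the commit_unknown_result error."""


class ToyStore:
    """In-memory stand-in for the database; not the FoundationDB API."""

    def __init__(self, lose_ack_on=()):
        self.data = {}
        self._versions = itertools.count(1)
        self._commits = 0
        self._lose_ack_on = set(lose_ack_on)

    def commit(self, marker_key, change_value):
        """Atomically write the fixed marker key and a versionstamped entry."""
        self._commits += 1
        version = next(self._versions)
        self.data[marker_key] = version              # fixed key, versionstamped value
        self.data[("changes", version)] = change_value
        if self._commits in self._lose_ack_on:
            # The write landed, but the client never hears about it.
            raise CommitUnknownResult()
        return version


def idempotent_insert(store, marker_key, change_value, max_attempts=5):
    for _ in range(max_attempts):
        # Check the known, fixed key first: if it exists, an earlier attempt
        # already committed and writing again would duplicate the entry.
        if marker_key in store.data:
            return store.data[marker_key]
        try:
            return store.commit(marker_key, change_value)
        except CommitUnknownResult:
            continue  # outcome unknown: loop and re-check the marker
    raise RuntimeError("gave up after repeated unknown results")


store = ToyStore(lose_ack_on={1})
version = idempotent_insert(store, ("revisions", "doc-a", b"\xaa"), "doc-a")
changes = [key for key in store.data if key[0] == "changes"]
```

In a real retry loop the marker read and the writes would sit inside the same transaction, so the check is protected by the usual conflict detection rather than by a separate lookup.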
> I had filed an internal bug long ago to complain about this before, which I’ve now copied
> over to GitHub[1].  So if this becomes absurdly difficult to work around, feel free to show
> up there to complain.
> 
> [1]: https://github.com/apple/foundationdb/issues/1321

Hi Alex, thanks for that comment and for taking a close read. Option 1 could almost work here:
as part of the same transaction we will be inserting up to two keys in a “revisions” subspace,
which we could read back and which would include both the RevHash and the Versionstamp. The
latest design for that subspace is here:

https://github.com/apache/couchdb-documentation/blob/5197cdffe1e2c08a7640dd646dd02909c0cf51ef/rfcs/001-fdb-revision-metadata-model.md

If I understand correctly, I think the edge case regarding `commit_unknown_result` that we’re
not adequately guarding against is the following series of events:

1) Txn A tries to commit an edit and gets `commit_unknown_result`; in reality, the transaction failed
2) Txn B tries to commit an *identical* edit (save for the versionstamp) and succeeds
3) Txn A retries and finds that the entry in “revisions” for this `RevHash` exists and that the
`Versionstamp` in “changes” for this DocID is higher than the one initially attempted

In this scenario we should report an edit conflict failure back to the client for Txn A, but
the end result is indistinguishable from the case where 

1) Txn A tries to commit an edit and gets `commit_unknown_result`; in reality, the transaction *succeeds*
2) Txn B tries to edit a *different* branch of the document and succeeds (thereby replacing
Txn A’s entry in “changes”)

which is a scenario where we need to report success for both Txn A and Txn B.
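
To make the ambiguity concrete, here is a toy Python sketch of what Txn A can observe when it retries in each of the two scenarios above. It is a simplified model rather than the real data layout: `observe`, the version numbers and the revision names are hypothetical, and "the one initially attempted" is modelled as the read version of A's first attempt, which is an assumption.

```python
def observe(revisions, changes_versionstamp, rev_hash, first_attempt_version):
    """What Txn A learns on retry when only the latest edit's versionstamp is kept.

    Returns (is A's RevHash present in "revisions"?,
             is the DocID's versionstamp in "changes" newer than A's first attempt?).
    """
    return (rev_hash in revisions, changes_versionstamp > first_attempt_version)


REV_A = "rev-a"          # hypothetical RevHash of the edit Txn A submitted
A_FIRST_ATTEMPT = 10     # hypothetical read version of A's first attempt

# Scenario 1: A's commit failed. B then committed an identical edit at version 12,
# so the same RevHash is present and "changes" points at version 12.
scenario_1 = observe({REV_A}, 12, REV_A, A_FIRST_ATTEMPT)

# Scenario 2: A's commit succeeded at version 11. B then edited a different branch
# at version 12, replacing A's entry in "changes".
scenario_2 = observe({REV_A, "rev-b"}, 12, REV_A, A_FIRST_ATTEMPT)

# Both observations are identical, so A cannot tell "conflict" from "success".
ambiguous = scenario_1 == scenario_2
```

In this model the only data that differs between the scenarios is the version at which `REV_A` itself was written, and that is not recorded anywhere A can read.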

We could close this loophole by storing the Versionstamp alongside the RevHash for every edit
in the “revisions” subspace, rather than only storing the Versionstamp of the latest edit
to the document. Not cheap though. Will give it some thought. Thanks!

Adam

