couchdb-dev mailing list archives

From Robert Samuel Newson <rnew...@apache.org>
Subject Re: [DISCUSS] couchdb 4.0 transactional semantics
Date Thu, 16 Jul 2020 18:24:17 GMT

Agreed on all 4 points. On the final point, it's worth noting that a continuous changes feed
was two-phase: the first phase is indeed over a snapshot of the db as of the start of the _changes
request, and the second phase is an endless series of subsequent snapshots. The 4.0 behaviour
won't exactly match that, but it's definitely in the same spirit.

Agreed also on requiring pagination (I've not reviewed the proposed pagination api in sufficient
detail to +1 it yet). Would we start the response as rows are retrieved, though? That's my
preference, with an unclean termination if we hit txn_too_old, and an upper bound on the "limit"
parameter or equivalent chosen such that txn_too_old is vanishingly unlikely.

On compatibility, there's precedent for a minor release of old branches just to add replicator
compatibility. For example, the replicator could call _changes again if it received a complete
_changes response (i.e., one that ended with a } that completes the JSON object) that did not
include a "last_seq" row. The 4.0 replicator would always do this.
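
A rough sketch of that retry rule, in Python for brevity (the real replicator is Erlang). The
`fetch` callable is hypothetical and stands in for "GET /db/_changes?since=..." returning the raw
body. Resuming from the "seq" of the last row received is my assumption, not something settled in
this thread.

```python
import json
from typing import Callable, Iterator, Optional


def read_changes(fetch: Callable[[Optional[str]], str]) -> Iterator[dict]:
    """Yield change rows, calling _changes again until a response has "last_seq".

    `fetch(since)` is a hypothetical helper that returns the raw response body.
    """
    since: Optional[str] = None
    while True:
        # A response cut short (unclean termination) is not valid JSON, so
        # json.loads raises here instead of silently dropping rows.
        body = json.loads(fetch(since))
        rows = body.get("results", [])
        for row in rows:
            yield row
            since = row["seq"]  # assumption: resume after the last row seen
        if "last_seq" in body:
            return  # the server says the feed is complete
        if not rows:
            # Guard against looping forever when no progress is possible.
            raise RuntimeError("response had neither rows nor last_seq")
```

A truncated body surfaces as a ValueError from the JSON parser, which matches the "unclean
termination" preference above: the caller can retry from the last seq it actually processed.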

B.

> On 16 Jul 2020, at 17:25, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> 
> From what I'm reading it sounds like we have general consensus on a few things:
> 
> 1. A single CouchDB API call should map to a single FDB transaction
> 2. We absolutely do not want to return a valid JSON response to any
> streaming API that hit a transaction boundary (because data
> loss/corruption)
> 3. We're willing to change the API requirements so that 2 is not an issue.
> 4. None of this applies to continuous changes since that API call was
> never a single snapshot.
> 
> If everyone generally agrees with that summarization, my suggestion
> would be that we just revisit the new pagination APIs and make them
> the only behavior rather than having them be opt-in. I believe those
> APIs already address all the concerns in this thread and the only
> reason we kept the older versions with `restart_tx` was to maintain
> API backwards compatibility at the expense of a slight change to
> semantics of snapshots. However, if there's a consensus that the
> semantics are more important than allowing a blanket `GET
> /db/_all_docs` I think it'd make the most sense to just embrace the
> pagination APIs that already exist and were written to cover these
> issues.
> 
> The only thing I'm not 100% on is how to deal with non-continuous
> replications. I.e., the older single shot replication. Do we go back
> with patches to older replicators to allow 4.0 compatibility? Just
> declare that you have to mediate a replication on the newer of the two
> CouchDB deployments? Sniff the replicator's UserAgent and behave
> differently on 4.x for just that special case?
> 
> Paul
> 
> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski <kocolosk@apache.org> wrote:
>> 
>> Sorry, I also missed that you quoted this specific bit about eagerly requesting a
>> new snapshot. Currently the code will just react to the transaction expiring, then wait till
>> it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds
>> on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then
>> proceed.
>> 
>> Adam
>> 
>>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski <kocolosk@apache.org> wrote:
>>> 
>>> Right now the code has an internal “restart_tx” flag that is used to automatically
>>> request a new snapshot if the original one expires and continue streaming the response. It
>>> can be used for all manner of multi-row responses, not just _changes.
>>> 
>>> As this is a pretty big change to the isolation guarantees provided by the database
>>> Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
>>> 
>>> Cheers, Adam
>>> 
>>>> On Jul 15, 2020, at 11:38 AM, Joan Touzet <wohali@apache.org> wrote:
>>>> 
>>>> I'm having trouble following the thread...
>>>> 
>>>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>>>> For cases where you’re not concerned about the snapshot isolation (e.g.
>>>>> streaming an entire _changes feed), there is a small performance benefit to requesting a new
>>>>> FDB transaction asynchronously before the old one actually times out and swapping over to
>>>>> it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it
>>>>> anywhere in CouchDB yet.
>>>> 
>>>> How does _changes work right now in the proposed 4.0 code?
>>>> 
>>>> -Joan
>>> 
>> 

