couchdb-dev mailing list archives

From "Robert Newson" <>
Subject Re: [DISCUSS] couchdb 4.0 transactional semantics
Date Wed, 15 Jul 2020 14:12:43 GMT

Thanks Jan

I would prefer not to have the configuration switch, instead remove what we don’t want.
As you said there’ll be a 3 / 4 split for a while (and not just for this reason). 
  Robert Samuel Newson

On Wed, 15 Jul 2020, at 14:46, Jan Lehnardt wrote:
> > On 14. Jul 2020, at 18:00, Adam Kocoloski <> wrote:
> > 
> > I think there’s tremendous value in being able to tell our users that each
> > response served by CouchDB is constructed from a single isolated snapshot of
> > the underlying database. I’d advocate for this being the default behavior of
> > 4.0.
> I too am in favour of this. I apologise for not speaking up in the 
> earlier thread, which I followed closely, but never found the time to 
> respond to.
> From rnewson’s options, I’d suggest 3. the mandatory limit parameter. 
> While this does indeed mean a BC break, it teaches the right semantics 
> for folks on 4.0 and onwards. For client libraries like our own nano, 
> we can easily wrap this behaviour, so the resulting API is mostly 
> compatible still (at least when used in streaming mode, less so when 
> buffering a big _all_docs response).
> > If folks wanted to add an opt-in compatibility mode to support longer
> > responses, I suppose that could be OK. I think we should discourage that
> > access pattern in general, though, as it’s somewhat less friendly to various
> > other parts of the stack than a pattern of shorter responses and a smart
> > pagination API like the one we’re introducing. To wit, I don’t think we’d
> > want to support that compatibility mode in IBM Cloud.
> Like Adam, I do not mind a compat mode, either through a different API 
> endpoint, or even a config option. I think we will be fine in getting 
> people on this path when we document this in our update guide for the 
> 4.0 release. I don’t think this will lead to a Python 2/3 situation 
> overall, because the 4.0+ features are compelling enough for relatively 
> small changes required, and CouchDB 3.x in its then latest form will 
> continue to be a fine database for years to come, for folks who can’t 
> upgrade as easily. So yes, I anticipate we’ll live in a two-versions 
> world a little longer than we did during 1.x to 2.x, but the reasons to 
> leave 1.x behind were a little more severe than the improvements of 4.x 
> over 3.x (while still significant, of course).
> Best
> Jan
> —
> > 
> > Adam
> > 
> >> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson <>
> >> 
> >> Thanks Nick, very helpful, and it vindicates me opening this thread.
> >> 
> >> I don't accept Mike Rhodes's argument at all, but I should explain why I don't:
> >> 
> >> In CouchDB 1.x, a response was generated from a single .couch file. There was
> >> always a window between the start of the request as the client sees it and
> >> CouchDB acquiring a snapshot of the relevant database. I don't think that gap
> >> is meaningful, and it does not refute our statements of the time that CouchDB
> >> responses are from a snapshot (specifically, that no change to the database
> >> made _during_ the response will be visible in _this_ response). In CouchDB 2.x
> >> (and continuing in 3.x), a CouchDB database typically consists of multiple
> >> shards, each of which, once opened, remains snapshotted for the duration of
> >> that response. The difference between 1.x and 2.x/3.x is that the window is
> >> potentially larger (though the requests are issued in parallel). The response,
> >> however much it returned, was impervious to changes in other requests once it
> >> had begun.
> >> 
> >> I don't think _all_docs, _view or a non-continuous _changes response should
> >> allow changes made in other requests to appear midway through them and I want
> >> to hear the opinions of folks that have watched over CouchDB from its earliest
> >> days on this specific point (if I must name names, at least Adam K, Paul D,
> >> Jan L, Joan T). If there's a majority for deviating from this semantic, I will
> >> go with the majority.
> >> 
> >> If we were to agree to preserve the 'single snapshot' behaviour, what would
> >> the behaviour be if we can't honour it because of the FoundationDB transaction
> >> limits?
> >> 
> >> I see a few options.
> >> 
> >> 1) We could end the response uncleanly, mid-response. CouchDB does this when
> >> it has no alternative, and it is ugly, but it is usually handled well by
> >> clients. They are at least not usually convinced they got a complete response
> >> if they are using a competent HTTP client.
> >> 
> >> 2) We could disavow the streaming API, as you've suggested, and attempt to
> >> gather the full response. If we do this within the FDB bounds, we return a 200
> >> code and the response body; if we don't, a 400 and an error body.
> >> 
> >> 3) We could make the "limit" parameter mandatory and with an upper bound, in
> >> combination with 1 or 2, such that a valid request is very likely to be served within the
> >> 
> >> I'd like to hear more voices on which way we want to break the unachievable
> >> semantic of old where you could read _all_docs on a billion document database
> >> over, uptime gods willing, a snapshot of the database.
> >> 
> >> B.
> >> 
> >>> On 13 Jul 2020, at 21:15, Nick Vatamaniuc <> wrote:
> >>> 
> >>> Thanks for bringing the topic up for the discussion!
> >>> 
> >>> For background, this topic was discussed on the mailing list starting
> >>> in February, 2019
> >>>
> >>> 
> >>> The primary reason for restart_tx option is to provide compatibility
> >>> for _changes feeds to allow older replicators to handle 4.0 sources.
> >>> It starts a new transaction after 5 seconds or so (a current FDB
> >>> limitation, might go up in the future) and transparently continues to
> >>> stream data where it left off. E.g., when streaming [a,b,c,d], if it
> >>> times out after b, it will continue with c, d, etc. Currently this is
> >>> also used for other streaming APIs as an alternative to returning mangled
> >>> JSON after emitting a 200 response and streaming some of the rows.
> >>> However, it is not used for paginated responses, the new APIs developed
> >>> by Ilya. So users have an option to get the guaranteed snapshot
> >>> behavior as well.
> >>> 
> >>> And for completeness, if we decide to remove the option, we should
> >>> specify what happens if we remove it and get a transaction_too_old
> >>> exception. Currently the behavior would be to restart the transaction,
> >>> resend all the headers and all the rows again down the socket, which I
> >>> don't think anyone wants, but is what we'd get if we just set
> >>> {restart_tx, false}.
> >>> 
> >>>> I understand that automatically resetting the FDB txn during a response
> >>>> is an attempt to work around that and maintain "compatibility" with CouchDB
> >>>> < 4 semantics. I think it fails to do so and is very misleading.
> >>> 
> >>> It is a trade-off in order to keep the same API shape as before. Sure,
> >>> streaming all the docs with _all_docs or _changes feeds is not a great
> >>> pattern but many applications are implemented that way already.
> >>> Letting them migrate to 4.0 without having to rewrite the application
> >>> with the caveat that they might see a document updated in the
> >>> _all_docs stream after the request has already started, is a nicer
> >>> choice, I think, than forcing them to rewrite their application, which
> >>> could lead to a python 2/3 scenario.
> >>> 
> >>> Due to having multiple shards (Q>1), as discussed in the original
> >>> mailing thread by Mike,
> >>> we don't provide a strict read-only snapshot guarantee in 2.x and 3.x
> >>> anyway, so users would have to handle scenarios where a document might
> >>> appear in the stream that wasn't there at the start of the request
> >>> already. Though, granted, a much smaller corner case but I wonder how
> >>> many users care to handle that...
> >>> 
> >>> Currently users do have an option of using the new paginated API, which
> >>> disables restart_tx behavior,
> >>> though I am not sure what happens when a transaction_too_old exception
> >>> is thrown then (emit a bookmark?)
> >>> 
> >>> So based on the compatibility consideration, I'd vote to keep the
> >>> restart_tx option (configurable perhaps if we figure out what to do
> >>> when it is disabled) in order to allow users to migrate their
> >>> application to 4.0. At least informally we promised users to keep a
> >>> strong API compatibility when we released 3.0 with an eye towards 4.0. I'd
> >>> think not emitting all the data in a _changes or _all_docs response
> >>> would break that compatibility more than using multiple transactions.
> >>> 
> >>> As for what happens when a transaction_too_old is thrown, I could see
> >>> an option passed in, something like, single_snapshot=true, and then
> >>> use Adam's suggestion to accumulate all the rows in memory and if we
> >>> hit the end of the transaction return a 400 error. We won't emit
> >>> anything out while rows are accumulated, so users don't get partial
> >>> data, it will be every row requested or a 400 error (so no chance of
> >>> perceived data loss). Users may retry if they think it was a temporary
> >>> hiccup or may use a small limit number.
> >>> 
> >>> Cheers,
> >>> -Nick
> >>> 
> >>> On Mon, Jul 13, 2020 at 2:05 PM Robert Samuel Newson <>
> >>>> 
> >>>> Hi All,
> >>>> 
> >>>> I'm concerned to see the restart_fold function in fabric2_fdb in the 4.0
> >>>> development branch.
> >>>> 
> >>>> The upshot of doing this is that a CouchDB response could be taken across
> >>>> multiple snapshots of the database, which is not the behaviour of CouchDB 1
> >>>> through 3.
> >>>> 
> >>>> I don't think this is ok (with the obvious and established exception
of a continuous changes feed, where new snapshots are continuously visible at the end of the
> >>>> 
> >>>> FoundationDB imposes certain limits on transactions, the most notable
> >>>> being the 5 second maximum duration. I understand that automatically
> >>>> resetting the FDB txn during a response is an attempt to work around that and
> >>>> maintain "compatibility" with CouchDB < 4 semantics. I think it fails to do so
> >>>> and is very misleading.
> >>>> 
> >>>> Discuss.
> >>>> 
> >>>> B.
> >>>> 
> >> 
> > 
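
To make the two behaviours debated in this thread concrete, here is a minimal Python sketch. It is not CouchDB or FoundationDB code: snapshot_source, fold_restart_tx, fold_single_snapshot and the rows_per_txn budget (a stand-in for the 5 second transaction limit) are hypothetical names invented for illustration, and the "database" is just a dict with a simulated concurrent writer. The first fold mimics the restart_tx idea (continue from the last key on a fresh snapshot), the second mimics the buffer-everything-or-return-400 idea.

```python
from typing import Callable, Dict, List, Tuple, Union

Snapshot = Dict[str, int]  # doc id -> revision counter (illustrative only)
Row = Tuple[str, int]


def snapshot_source(db: Snapshot, bump_key: str) -> Callable[[], Snapshot]:
    """Return a function that copies db, then simulates a concurrent write."""
    def take_snapshot() -> Snapshot:
        snap = dict(db)
        db[bump_key] += 1  # another request updates bump_key after each snapshot
        return snap
    return take_snapshot


def fold_restart_tx(take_snapshot: Callable[[], Snapshot],
                    rows_per_txn: int) -> List[Row]:
    """Stream rows in key order, opening a new snapshot when the budget runs out.

    Rows emitted after a restart come from a later snapshot, so one response
    can mix several database states.
    """
    if rows_per_txn < 1:
        raise ValueError("rows_per_txn must be at least 1")
    rows: List[Row] = []
    last_key = None
    while True:
        snapshot = take_snapshot()
        keys = sorted(k for k in snapshot if last_key is None or k > last_key)
        if not keys:
            return rows
        for key in keys[:rows_per_txn]:
            rows.append((key, snapshot[key]))
            last_key = key


def fold_single_snapshot(
    take_snapshot: Callable[[], Snapshot], rows_per_txn: int
) -> Tuple[int, Union[List[Row], Dict[str, str]]]:
    """Buffer the whole response from one snapshot, or fail with a 400."""
    snapshot = take_snapshot()
    keys = sorted(snapshot)
    if len(keys) > rows_per_txn:
        # Nothing has been sent yet, so the client never sees partial data.
        return 400, {"error": "transaction_too_old"}
    return 200, [(k, snapshot[k]) for k in keys]
```

With four documents a, b, c, d all at revision 1, a writer that bumps d after every snapshot, and a budget of two rows per transaction, fold_restart_tx returns d at revision 2: a change made after the request started shows up midway through the response. fold_single_snapshot under the same budget returns the 400 instead, and with a budget large enough for all four rows it returns every row at revision 1.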
