couchdb-dev mailing list archives

From Jan Lehnardt <>
Subject Re: [DISCUSS] FoundationDB read versions and CouchDB requests
Date Thu, 10 Oct 2019 10:55:08 GMT

> On 7. Oct 2019, at 11:34, Mike Rhodes <> wrote:
> All,
> I think my email wasn't my clearest missive ever, so likely pretty easy to get lost in it :)
> I think my idea to include the read version in the document rev ID is likely a bad one.
> But, if we are already including it in the database seq value, and we've done the work to
> make that number transfer cleanly across FDB instances, there's probably some interesting
> API directions post 4.0 where we make more use of that value towards more efficient RYW at
> a database level.
> I'm beginning to get a feeling for what this API might look like while avoiding being painful/confusing.
> For example, given the more advanced nature of these APIs, I'd see them operating at the HTTP
> header level, where we can provide request and response headers with the same name to support
> sending/receiving database seq values across ~all read/write requests.
> Taking Joan and Adam's points on board, my view is that we semi-shelve this discussion
> to enable focus on getting 4.0 out the door. But, I think we can start to introduce useful
> functionality based on this during the 4.x series if we're careful (i.e., avoiding breaking
> changes). Of course, we should probably step back to user pain points first as Joan implies,
> otherwise we're building for the sake of building and the opportunity cost is not negligible.
> I think we might also want to hash out the "interchangeable backend" question a bit more.
> As Adam says, FDB gives us a number of features that would be hard to replicate on different
> backends -- at least in the clustered case -- so nailing down a position there sounds important.

I think for now there is only a need for one other backend, and it is for lower-resource systems,
and I’d be fine with requiring those to be non-clustered.

I’m thinking of RasPi and desktop software scenarios.


> -- 
> Mike.
> On Mon, 30 Sep 2019, at 18:12, Adam Kocoloski wrote:
>> Hi Joan,
>> Allowing clients to choose the DB sequence at which a read is performed 
>> won’t have any effect on replication.
>> If we end up enhancing _bulk_docs so that it can use a single 
>> transaction for all the documents in the batch then that’s where the 
>> replicator might need to get smarter, e.g. by inferring that a range of 
>> the _changes feed corresponds to a single transaction (based on 
>> knowledge of the Sequence encoding) and ensuring that the transaction 
>> is also written to the target atomically.
>> I hadn’t gotten as far as thinking about release numbers here, just 
>> thinking about what’s possible. You’re right about the positioning of 
>> 4.0, although that was largely an attempt to head off anyone thinking 
>> about a wholesale replacement of the current API as part of the 
>> FoundationDB work rather than a “no new features” ban. Cheers,
>> Adam
>>> On Sep 27, 2019, at 8:16 AM, Joan Touzet <> wrote:
>>> On 2019-09-26 17:04, Adam Kocoloski wrote:
>>>>> On Sep 26, 2019, at 1:38 PM, Joan Touzet <> wrote:
>>>>> On 2019-09-26 13:14, Adam Kocoloski wrote:
>>>>>> Hi Joan, no need for apologies! Snipping out a few bits:
>>>>>>> One alternative is to always keep just one around, and constantly
>>>>>>> it every 5s, whether it's used or not (idle server).
>>>>>> Agreed, I see no reason to keep multiple old read versions around
>>>>>> on a given CouchDB node. Updating it every second or two could be a nice thing to do (I wouldn’t
>>>>>> wait 5 seconds because handing out a read version 4.95 seconds old isn’t very useful to
>>>>>> anyone ;).
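
A minimal sketch of the "keep just one read version and refresh it" idea discussed above, in Python. Everything here is hypothetical and is not CouchDB's implementation: `fetch_read_version` stands in for whatever call asks FoundationDB for a fresh read version, and the class name, the `clock` parameter, and the one-second default are illustrative only.

```python
import time
from typing import Callable, Optional


class CachedReadVersion:
    """Keep exactly one read version and refresh it once it is too old."""

    def __init__(
        self,
        fetch_read_version: Callable[[], int],  # hypothetical FDB round trip
        max_age_seconds: float = 1.0,           # "every second or two"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_read_version
        self._max_age = max_age_seconds
        self._clock = clock
        self._version: Optional[int] = None
        self._fetched_at = 0.0

    def get(self) -> int:
        """Return the cached version, fetching a new one if it has expired."""
        now = self._clock()
        if self._version is None or now - self._fetched_at >= self._max_age:
            self._version = self._fetch()
            self._fetched_at = now
        return self._version
```

This variant refreshes lazily, on the first request after the cached value expires. The "refresh whether it's used or not" alternative from the thread would instead call the fetch function from a background timer.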
>>>>>>> This second option seems better, but as mentioned later we don't want it
>>>>>>> to be a transparent FDB token (or convertible into one). This
>>>>>>> the nonce approach we use in _changes feeds to ensure a stable feed, yeah?
>>>>>> In our current design we _do_ expose FDB versions pretty directly
>>>>>> as database update sequences (there’s a small prefix to allow for _changes to stay monotonically
>>>>>> increasing when relocating a database to a new FDB cluster). I believe it’s worth us thinking
>>>>>> about expanding the use of sequences to other places in the API as those are a concept that’s
>>>>>> already pretty familiar to our users.
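
To illustrate why a small prefix keeps sequences monotonically increasing across a move to a new FDB cluster, here is a hedged Python sketch. The thread does not describe the real encoding, so the two-field layout and the names `Seq`, `cluster_epoch`, and `next_seq` are assumptions made for illustration.

```python
from typing import NamedTuple


class Seq(NamedTuple):
    """Hypothetical update sequence: a prefix plus an FDB version."""

    cluster_epoch: int  # assumed prefix, bumped when the database is relocated
    fdb_version: int    # version handed out by the current FDB cluster


def next_seq(previous: Seq, fdb_version: int, relocated: bool = False) -> Seq:
    """Build the next sequence, bumping the prefix after a relocation."""
    epoch = previous.cluster_epoch + 1 if relocated else previous.cluster_epoch
    return Seq(epoch, fdb_version)
```

Because tuples compare field by field, a sequence issued after a relocation sorts after every earlier one, even when the new cluster's versions start from a smaller number.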
>>>>> Did users ever craft their own 2.x db update sequence tokens to abuse
>>>>> the system? Probably not, because our clustering code was hard to
>>>>> understand. Did users ever craft their own 1.x db update sequence
>>>>> values? Yes, and it caused lots of problems.
>>>> I don't remember the problems that this caused in 1.x, but I can certainly
>>>> imagine a too-clever user generating a sequence that doesn’t correspond to any consistent
>>>> FDB version and supplying it. FoundationDB allows for this sort of thing with the ominous
>>>> caveat: “The database cannot guarantee causal consistency if this method is used (the transaction’s
>>>> reads will be causally consistent only if the provided read version has that property).”
>>>> So … yeah.
>>> Eek. I don't like introducing new sharp edges.
>>>>> Does this prevent implementing the CouchDB API on any other backend?
>>>>> which case, I'd be -1.... In other words, at the very least we need to
>>>>> reinforce that the token is opaque and that manipulating it can produce
>>>>> both undefined errors as well as potentially lead to (perceived?) data
>>>> I mean, we’re already down the path where we are using various specific
>>>> features of FoundationDB (versionstamps, atomic operations, and of course transactions) that
>>>> would not necessarily be in an arbitrary key-value store. I suppose adding this enhancement
>>>> would add to the list of requirements on an underlying storage engine, but if a storage engine
>>>> couldn’t support transactions with snapshot isolation I’m not sure it’d be a good choice
>>>> for us anyway. Even something as basic as atomic maintenance of the _all_docs and _changes
>>>> indexes becomes a heroic effort without that.
>>> Two questions:
>>> I thought 4.0 was supposed to be "no new functionality, just 2.x/3.x
>>> semantics on top of FDB?" Is this something you're looking at for a
>>> later release?
>>> And how do you foresee e.g. PouchDB keeping up if they're not going to
>>> put FDB on a mobile device - will we necessarily be implementing API
>>> endpoints that demand an FDB backend that can't ever exist on a
>>> different implementation? How will this affect replication, for
>>> instance, if at all?
>>>>>>> If we eschew API changes for 4.0 then we need to decide on the default. And if
>>>>>>> we're voting, I'd say making RYWs the default (never hanging onto a
>>>>>>> handle) and then (ab-)using stale=ok or whatever state we have
>>>>>>> around might be sufficient.
>>>>>> I definitely agree. We should not be using old read versions without
>>>>>> the client’s knowledge unless it's for some internal process where we know all the tradeoffs.
>>>>>>> This is the really important data point here for me. While Cloudant
>>>>>>> cares about 2-3 extra ms on the server side, many many MANY CouchDB
>>>>>>> users don't. Can we benchmark what this looks like when running
>>>>>>> FDB+CouchDB on a teeny platform like a RasPi? Is it still 2-3ms?
>>>>>>> about the average laptop/desktop? Or is it only 2-3ms on a beefy
>>>>>>> Cloudant-sized server?
>>>>>> I don’t have hard performance numbers, but I expect that acquiring
>>>>>> a read version in a small-scale deployment is faster than the same operation against a big
>>>>>> FoundationDB deployment spanning zones in a cloud region. When you scale down e.g. to a single
>>>>>> FDB process that process ends up playing all the roles that need to collaborate to decide
>>>>>> on a read version and so the network latency gets taken out of the picture.
>>>>> Then I'm concerned this is premature optimization.
>>>> A fair concern. What I really like about this is that the way to more efficient
>>>> operations is exposing richer transactional semantics to users. How often do you get a deal
>>>> like that!
>>> Neat...but I'm always nervous when people spin new technology as a
>>> win-win, there's always a tradeoff, so I reserve the right to be
>>> skeptical until proven otherwise ;)
>>>> Cheers, Adam
>>> -Joan
