couchdb-dev mailing list archives

From "Mike Rhodes" <couc...@dx13.co.uk>
Subject Re: [DISCUSS] FoundationDB read versions and CouchDB requests
Date Mon, 07 Oct 2019 09:34:37 GMT
All,

I think my email wasn't my clearest missive ever, so it was likely pretty easy to get lost
in it :)

I think my idea to include the read version in the document rev ID is likely a bad one. But,
if we are already including it in the database seq value, and we've done the work to make
that number transfer cleanly across FDB instances, there's probably some interesting API directions
post 4.0 where we make more use of that value towards more efficient RYW at a database level.
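
As a rough illustration of a seq value that "transfers cleanly", here is a minimal sketch. It assumes the prefix is a simple counter that is bumped whenever a database moves to a new FDB cluster. The `Seq` type, its field names and `next_seq_after_move` are hypothetical; the real encoding isn't spelled out in this thread.

```python
from typing import NamedTuple


class Seq(NamedTuple):
    """Hypothetical database seq: a small prefix plus an FDB version."""

    epoch: int        # assumed prefix, bumped on each move to a new FDB cluster
    fdb_version: int  # version handed out by the cluster currently in use


def next_seq_after_move(last: Seq, first_version_on_new_cluster: int) -> Seq:
    """Build the first seq issued after relocating to a new cluster."""
    return Seq(last.epoch + 1, first_version_on_new_cluster)


# The new cluster's versions may restart far below the old cluster's, but
# tuples compare field by field, so the prefix keeps the seq increasing.
old = Seq(epoch=0, fdb_version=9_000_000)
new = next_seq_after_move(old, 1_500)
```

Comparing `new > old` holds here even though the raw FDB version went down, which is the property the prefix is there to provide.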

I'm beginning to get a feel for what this API might look like without being painful/confusing.
For example, given the more advanced nature of these APIs, I'd see them operating at the HTTP
header level, where we can provide request and response headers with the same name to support
sending/receiving database seq values across ~all read/write requests.
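
To make the header idea a bit more concrete, here is a minimal client-side sketch. The header name `X-Couch-DB-Seq` and both helper functions are made up for illustration; nothing in this thread settles the name or the exact semantics, and the seq is treated as an opaque string throughout.

```python
from typing import Dict, Optional

# Hypothetical header name, used for both requests and responses.
SEQ_HEADER = "X-Couch-DB-Seq"


def request_headers(last_seen_seq: Optional[str]) -> Dict[str, str]:
    """Headers for a request that should observe at least `last_seen_seq`."""
    if last_seen_seq is None:
        return {}
    return {SEQ_HEADER: last_seen_seq}


def seq_from_response(response_headers: Dict[str, str]) -> Optional[str]:
    """Extract the database seq from response headers without parsing it."""
    for name, value in response_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == SEQ_HEADER.lower():
            return value
    return None


# Round trip: a write response carries a seq, and the next read sends it back.
write_response = {"x-couch-db-seq": "42-abc"}
seq = seq_from_response(write_response)
next_read = request_headers(seq)
```

Keeping this in headers means request and response bodies stay unchanged for clients that don't care about it.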

Taking Joan and Adam's points on board, my view is that we semi-shelve this discussion to
enable focus on getting 4.0 out the door. But, I think we can start to introduce useful functionality
based on this during the 4.x series if we're careful (i.e., avoiding breaking changes). Of
course, we should probably step back to user pain points first as Joan implies, otherwise
we're building for the sake of building and the opportunity cost is not negligible.

I think we might also want to hash out the "interchangeable backend" question a bit more. As
Adam says, FDB gives us a number of features that would be hard to replicate on different
backends -- at least in the clustered case -- so nailing down a position there sounds important.

-- 
Mike.

On Mon, 30 Sep 2019, at 18:12, Adam Kocoloski wrote:
> Hi Joan,
> 
> Allowing clients to choose the DB sequence at which a read is performed 
> won’t have any effect on replication.
> 
> If we end up enhancing _bulk_docs so that it can use a single 
> transaction for all the documents in the batch then that’s where the 
> replicator might need to get smarter, e.g. by inferring that a range of 
> the _changes feed corresponds to a single transaction (based on 
> knowledge of the Sequence encoding) and ensuring that the transaction 
> is also written to the target atomically.
> 
> I hadn’t gotten as far as thinking about release numbers here, just 
> thinking about what’s possible. You’re right about the positioning of 
> 4.0, although that was largely an attempt to head off anyone thinking 
> about a wholesale replacement of the current API as part of the 
> FoundationDB work rather than a “no new features” ban. Cheers,
> 
> Adam
> 
> > On Sep 27, 2019, at 8:16 AM, Joan Touzet <wohali@apache.org> wrote:
> > 
> > 
> > 
> > On 2019-09-26 17:04, Adam Kocoloski wrote:
> >> 
> >>> On Sep 26, 2019, at 1:38 PM, Joan Touzet <wohali@apache.org> wrote:
> >>> 
> >>> On 2019-09-26 13:14, Adam Kocoloski wrote:
> >>>> Hi Joan, no need for apologies! Snipping out a few bits:
> >>>> 
> >>>>> One alternative is to always keep just one around, and constantly update
> >>>>> it every 5s, whether it's used or not (idle server).
> >>>> 
> >>>> Agreed, I see no reason to keep multiple old read versions around on a
> >>>> given CouchDB node. Updating it every second or two could be a nice thing
> >>>> to do (I wouldn’t wait 5 seconds because handing out a read version 4.95
> >>>> seconds old isn’t very useful to anyone ;).
> >>>> 
> >>>>> This second option seems better, but as mentioned later we don't want it
> >>>>> to be a transparent FDB token (or convertible into one). This parallels
> >>>>> the nonce approach we use in _changes feeds to ensure a stable feed, yeah?
> >>>> 
> >>>> In our current design we _do_ expose FDB versions pretty directly as
> >>>> database update sequences (there’s a small prefix to allow for _changes to
> >>>> stay monotonically increasing when relocating a database to a new FDB
> >>>> cluster). I believe it’s worth us thinking about expanding the use of
> >>>> sequences to other places in the API as those are a concept that’s already
> >>>> pretty familiar to our users.
> >>> 
> >>> Did users ever craft their own 2.x db update sequence tokens to abuse
> >>> the system? Probably not, because our clustering code was hard to
> >>> understand. Did users ever craft their own 1.x db update sequence
> >>> values? Yes, and it caused lots of problems.
> >> 
> >> I don't remember the problems that this caused in 1.x, but I can certainly
> >> imagine a too-clever user generating a sequence that doesn’t correspond to
> >> any consistent FDB version and supplying it. FoundationDB allows for this
> >> sort of thing with the ominous caveat: “The database cannot guarantee causal
> >> consistency if this method is used (the transaction’s reads will be causally
> >> consistent only if the provided read version has that property).” So … yeah.
> > 
> > Eek. I don't like introducing new sharp edges.
> > 
> >> 
> >>> Does this prevent implementing the CouchDB API on any other backend? In
> >>> which case, I'd be -1.... In other words, at the very least we need to
> >>> reinforce that the token is opaque and that manipulating it can produce
> >>> both undefined errors as well as potentially lead to (perceived?) data loss.
> >> 
> >> I mean, we’re already down the path where we are using various specific
> >> features of FoundationDB (versionstamps, atomic operations, and of course
> >> transactions) that would not necessarily be in an arbitrary key-value store.
> >> I suppose adding this enhancement would add to the list of requirements on
> >> an underlying storage engine, but if a storage engine couldn’t support
> >> transactions with snapshot isolation I’m not sure it’d be a good choice for
> >> us anyway. Even something as basic as atomic maintenance of the _all_docs
> >> and _changes indexes becomes a heroic effort without that.
> > 
> > Two questions:
> > 
> > I thought 4.0 was supposed to be "no new functionality, just 2.x/3.x
> > semantics on top of FDB?" Is this something you're looking at for a
> > later release?
> > 
> > And how do you foresee e.g. PouchDB keeping up if they're not going to
> > put FDB on a mobile device - will we necessarily be implementing API
> > endpoints that demand an FDB backend that can't ever exist on a
> > different implementation? How will this affect replication, for
> > instance, if at all?
> > 
> >>>>> If we eschew API changes for 4.0 then we need to decide on the default.
> >>>>> And if we're voting, I'd say making RYWs the default (never hanging onto a
> >>>>> handle) and then (ab-)using stale=ok or whatever state we have lying
> >>>>> around might be sufficient.
> >>>> 
> >>>> I definitely agree. We should not be using old read versions without the
> >>>> client’s knowledge unless it's for some internal process where we know all
> >>>> the tradeoffs.
> >>>> 
> >>>>> This is the really important data point here for me. While Cloudant
> >>>>> cares about 2-3 extra ms on the server side, many many MANY CouchDB
> >>>>> users don't. Can we benchmark what this looks like when running
> >>>>> FDB+CouchDB on a teeny platform like a RasPi? Is it still 2-3ms? What
> >>>>> about the average laptop/desktop? Or is it only 2-3ms on a beefy
> >>>>> Cloudant-sized server?
> >>>> 
> >>>> I don’t have hard performance numbers, but I expect that acquiring a read
> >>>> version in a small-scale deployment is faster than the same operation
> >>>> against a big FoundationDB deployment spanning zones in a cloud region.
> >>>> When you scale down, e.g. to a single FDB process, that process ends up
> >>>> playing all the roles that need to collaborate to decide on a read version,
> >>>> and so the network latency gets taken out of the picture.
> >>> 
> >>> Then I'm concerned this is premature optimization.
> >> 
> >> A fair concern. What I really like about this is that the way to more
> >> efficient operations is exposing richer transactional semantics to users.
> >> How often do you get a deal like that!
> > 
> > Neat...but I'm always nervous when people spin new technology as a
> > win-win, there's always a tradeoff, so I reserve the right to be
> > skeptical until proven otherwise ;)
> > 
> >> 
> >> Cheers, Adam
> >> 
> > 
> > -Joan
> 
>
