couchdb-dev mailing list archives

From Ilya Khlopotov <iil...@apache.org>
Subject Re: [DISCUSS] Streaming API in CouchDB 4.0
Date Fri, 24 Apr 2020 09:45:32 GMT
> On versioning, I've not seen a better article than this one: https://www.troyhunt.com/your-api-versioning-is-wrong-which-is/
I wouldn't propose a new endpoint if we had a strong story for API versioning. Currently
we don't.
BTW, we could put these new endpoints into a new namespace, for example `_v2/_all_docs`.
In that case we wouldn't need to invent new names.

Best regards,
iilyak

On 2020/04/23 21:31:41, Robert Samuel Newson <rnewson@apache.org> wrote: 
> On versioning, I've not seen a better article than this one: https://www.troyhunt.com/your-api-versioning-is-wrong-which-is/
> 
> For _changes, definitely agree we should be including it in this discussion;
> it is the only endpoint with, in theory, an eternal response, and I think
> that's a bug, not a feature, these days. CouchDB exists in a wider ecosystem
> (and often behind a load balancer), so it would be good to define an upper
> bound on how long you can listen before being forced to query again.
> 
> B.
> 
> > On 23 Apr 2020, at 22:15, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> > 
> > I'd agree that my initial reaction to cursor was that it's not a great
> > fit, but it does seem to be the name already used in the greater
> > REST world for this sort of pagination, so I'm not concerned about
> > using that terminology.
> > 
> > I'm generally on board with allowing and setting some sane default
> > limits on pages. We probably should have done that quite a while ago,
> > after moving to native clustering, and now that we have FDB limits I
> > think it makes even more sense to have an API that does not lend
> > itself to crazy errors when people are just trying to poke at an API.
> > 
> > I think we're all on board that one of the goals is to make sure that
> > clients don't accidentally misinterpret a response. That is, we're
> > trying to be quite diligent that a user doesn't get 1000 rows and not
> > realize there's another 10 that were beyond the limit. The bookmark
> > approach with hard caps seems like a generally fine approach to me.
> > The current approach uses extra URL path segments to try and avoid
> > this confusion. I wonder if we should consider starting to properly
> > version our API using one of the many schemes that are used. Having
> > read through a few articles I don't have a very clear favorite though.
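As a rough illustration of the bookmark-with-hard-caps flow, here is a minimal Python client sketch. The `{db}/_all_docs/page` endpoint and the `items`/`bookmark`/`completed` response shape are the ones proposed later in this thread, and are assumptions, not a shipped API.

```
import requests

# Minimal client sketch for bookmark pagination, assuming the proposed
# (hypothetical) {db}/_all_docs/page endpoint and its response shape.
def fetch_all(url="http://127.0.0.1:5984/db/_all_docs/page"):
    rows, bookmark = [], None
    while True:
        params = {"bookmark": bookmark} if bookmark else {}
        body = requests.get(url, params=params).json()
        rows.extend(body["items"])
        if body.get("completed"):
            return rows
        bookmark = body["bookmark"]  # opaque token, never parsed client-side
```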
> > 
> > As to this particular proposal I do see a couple issues:
> > 
> > `total` - We can do this in most cases fairly easily. Though it's a
> > bit odd for continuous changes.
> > 
> > `complete` - I'm not sure whether this is entirely possible given the
> > API that FDB presents us. Specifically, when we set a range and we get
> > back exactly $num_rows in the response, if the data set ended at
> > exactly that page I don't think the `more` flag from fdb would tell us
> > that. So we'd have a clunky UX there where we say not complete but the
> > next page is empty. That's also not to mention that, depending on
> > whether we're looking at snapshots and so on, there's no way for
> > us to know between stateless requests whether more rows were
> > added to the end.
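One way around the `more`-flag ambiguity, sketched below with the Python FDB bindings, is to read limit+1 rows and derive completeness from the overshoot. This is a sketch of the idea, not chttpd code.

```
import fdb

fdb.api_version(620)

# Sketch: derive a reliable `complete` flag by reading one row past the
# page size, instead of trusting the range read's `more` indicator.
@fdb.transactional
def read_page(tr, begin, end, limit):
    rows = list(tr.get_range(begin, end, limit=limit + 1))
    complete = len(rows) <= limit  # no overshoot row means we hit the end
    return rows[:limit], complete
```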
> > 
> > `page` - This one is just hard/impossible to calculate. FDB doesn't
> > provide us with offsets or even an efficient "about how many rows in
> > this range?" type query, so providing that would be both inaccurate
> > and fairly difficult/expensive to calculate. In some cases I think we
> > could have something maybe close that didn't suck too badly, but it'd
> > also fall down for changes due to the way that updates reorder them.
> > 
> > `update_seq` - I'm just not sure on when this would be useful or what
> > it would refer to. Maybe a version stamp of the last change for that
> > request? If we had a future API that asked for a snapshot access then
> > maybe? But if we did do something there with versionstamps or read
> > versions I'd expect that to come with the rest of the API.
> > 
> > For the bookmark fields:
> > 
> > `direction` vs `descending` seems like a field duplication to me.
> > 
> > `page` - This would seem to suggest we could skip to a certain
> > location in the results numerically which we are not able to do with
> > the FDB API.
> > 
> > `last_key` vs `start_key` seems like a field duplication. We don't
> > need to know where things started I don't think. Just where to start
> > from and where to end.
> > 
> > `update_seq` - is the same as earlier. Not entirely sure on the intent there.
> > 
> > `timestamp` - Expiring bookmarks based on time does not seem like a
> > good idea. Both for clock skew and why bother when this would
> > functionally just be a convenience API that users could already
> > implement for themselves.
> > 
> > Another thing might also be to provide our bookmark as a full link,
> > which seems to be fairly standard REST practice these days. Something
> > that clients don't have to do any logic with so that we're free to
> > change the implementation.
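For illustration, a server-side helper for such a link could be as small as the sketch below; the endpoint name and query parameters are assumptions.

```
from urllib.parse import urlencode

# Sketch: hand clients a ready-made "next" link instead of a bare token,
# so all pagination state stays server-defined and free to change.
def next_link(db, bookmark, limit):
    query = urlencode({"bookmark": bookmark, "limit": limit})
    return f"/{db}/_all_docs/page?{query}"
```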
> > 
> > And lastly, I don't think we should be neglecting the _changes API as
> > part of this discussion. I realize that we'll need to support the
> > older streaming semantics if we want to maintain replication
> > compatibility (which I think we'll all agree is a Good Thing) but it
> > also feels a bit wrong to ignore it as part of this work if we're
> > going to be modernizing our APIs. Though if we do pick up a good
> > versioning scheme then we could theoretically make those changes
> > easily enough. Plus, who doesn't want to rewrite chttpd to be a whole
> > lot less... chttpd-y?
> > 
> > 
> > On Thu, Apr 23, 2020 at 1:43 PM Robert Samuel Newson <rnewson@apache.org> wrote:
> >> 
> >> 
> >> I think it's a key difference from "cursor" as I've seen them elsewhere
> >> that ours will point at an ever-changing database; you couldn't seamlessly
> >> cursor through a large data set, one "page" at a time.
> >> 
> >> Bookmarks began in search (raises guilty hand) in order to address a
> >> Lucene-specific issue (that high values of "skip" are incredibly
> >> inefficient, using lots of RAM). That is not true for CouchDB's own
> >> indexes, which can be navigated perfectly with
> >> startkey/endkey/startkey_docid/endkey_docid, etc.
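For reference, the startkey/endkey navigation rnewson describes is already enough for client-side paging. A minimal sketch in Python with the `requests` library, assuming a stock `/{db}/_all_docs` endpoint: fetch limit+1 rows and seed the next request from the extra row.

```
import json
import requests

# Sketch of classic startkey/startkey_docid paging against /{db}/_all_docs:
# request one extra row so the client knows where the next page begins.
def pages(url, limit=100):
    params = {"limit": limit + 1}
    while True:
        rows = requests.get(url, params=params).json()["rows"]
        yield rows[:limit]
        if len(rows) <= limit:
            return  # no extra row, so this was the last page
        nxt = rows[limit]
        params["startkey"] = json.dumps(nxt["key"])
        params["startkey_docid"] = nxt["id"]
```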
> >> 
> >> I guess I'm not helping much with these observations but I wouldn't like
> >> to see CouchDB gain an additional and ugly method of doing something
> >> already possible.
> >> 
> >> B.
> >> 
> >>> On 23 Apr 2020, at 19:02, Joan Touzet <wohali@apache.org> wrote:
> >>> 
> >>> I realise this is bikeshedding, but I guess that's kind of the point...
> >>> Everything below is my opinion, not "fact."
> >>> 
> >>> It's unfortunate we need a new endpoint for all of this. In a vacuum I
> >>> might have just suggested we use the semantics we already have, perhaps
> >>> with ?from= instead of ?since=.
> >>> 
> >>> "page" only works if the size of a page is well known, either by server
> >>> preference or directly in the URL. If I ask for:
> >>> 
> >>> GET /{db}/_all_docs?limit=20&page=3
> >>> 
> >>> I know that I'm always going to get documents 41 through 60 in the
> >>> default collation order.
> >>> 
> >>> There's a *fantastic* summary of examples from popular REST APIs here:
> >>> 
> >>> https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c
> >>> 
> >>> We are *pretty close* to what a cursor means in those other examples,
> >>> except for the fact that our cursor can go stale/invalid after a short
> >>> time.
> >>> 
> >>> Bob, could you be a bit more detailed in your explanation of how our
> >>> definition isn't close to these? Or did you mean SQL CURSOR (which is
> >>> something entirely different)? If so, I'm fine with this being a REST API
> >>> cursor - something clearly distinct.
> >>> 
> >>> I come back to wanting to preserve the existing endpoint syntax and
> >>> naming, without new endpoints, but specifying this new FDB token via
> >>> ?cursor= and this being the trigger for the new behaviour. At some point,
> >>> we simply stop accepting ?since= tokens. This seems in line with other
> >>> popular REST APIs.
> >>> 
> >>> -Joan "still sick and not sleeping right" Touzet
> >>> 
> >>> 
> >>> On 2020-04-23 12:30, Robert Newson wrote:
> >>>> cursor has established meaning in other databases and ours would not be
> >>>> very close to them. I don’t think it’s a good idea.
> >>>> B.
> >>>>> On 23 Apr 2020, at 11:50, Ilya Khlopotov <iilyak@apache.org> wrote:
> >>>>> 
> >>>>> 
> >>>>>> 
> >>>>>> The best I could come up with is replacing page with
> >>>>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
> >>>>> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
> >>>>> 
> >>>>>> On 2020/04/23 08:54:36, Garren Smith <garren@apache.org> wrote:
> >>>>>> I agree with Bob that page doesn't make sense as an endpoint. I'm also
> >>>>>> rubbish with naming. The best I could come up with is replacing page
> >>>>>> with cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs.
> >>>>>> All the fields in the bookmark make sense except timestamp. Why would
> >>>>>> it matter if the timestamp is old? What happens if a node's time is an
> >>>>>> hour behind another node?
> >>>>>> 
> >>>>>> 
> >>>>>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov <iilyak@apache.org> wrote:
> >>>>>>> 
> >>>>>>> - page is to provide some notion of progress for the user
> >>>>>>> - timestamp - I was thinking that we should drop requests if the user
> >>>>>>> tried to pass a bookmark created an hour ago.
> >>>>>>> 
> >>>>>>>> On 2020/04/22 21:58:40, Robert Samuel Newson <rnewson@apache.org> wrote:
> >>>>>>>> "page" and "page number" are odd to me as these don't exist as
> >>>>>>>> concepts, I'd rather not invent them. I note there's no mention of
> >>>>>>>> page size, which makes "page number" very vague.
> >>>>>>>> 
> >>>>>>>> What is "timestamp" in the bookmark and what effect does it have
> >>>>>>>> when the bookmark is passed back in?
> >>>>>>>> 
> >>>>>>>> I guess, why does the bookmark include so much extraneous data?
> >>>>>>>> Items that are not needed to find the fdb key to begin the next
> >>>>>>>> response from.
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov <iilyak@apache.org> wrote:
> >>>>>>>>> 
> >>>>>>>>> Hello everyone,
> >>>>>>>>> 
> >>>>>>>>> Based on the discussions on the thread I would like to propose a
> >>>>>>>>> number of first steps:
> >>>>>>>>> 1) introduce new endpoints
> >>>>>>>>> - {db}/_all_docs/page
> >>>>>>>>> - {db}/_all_docs/queries/page
> >>>>>>>>> - _all_dbs/page
> >>>>>>>>> - _dbs_info/page
> >>>>>>>>> - {db}/_design/{ddoc}/_view/{view}/page
> >>>>>>>>> - {db}/_design/{ddoc}/_view/{view}/queries/page
> >>>>>>>>> - {db}/_find/page
> >>>>>>>>> 
> >>>>>>>>> These new endpoints would act as follows:
> >>>>>>>>> - don't use delayed responses
> >>>>>>>>> - return object with following structure
> >>>>>>>>> ```
> >>>>>>>>> {
> >>>>>>>>>   "total": Total,
> >>>>>>>>>   "bookmark": base64 encoded opaque value,
> >>>>>>>>>   "completed": true | false,
> >>>>>>>>>   "update_seq": when available,
> >>>>>>>>>   "page": current page number,
> >>>>>>>>>   "items": [
> >>>>>>>>>   ]
> >>>>>>>>> }
> >>>>>>>>> ```
> >>>>>>>>> - the bookmark would include the following data (base64 or protobuf???):
> >>>>>>>>> ```
> >>>>>>>>> - direction
> >>>>>>>>> - page
> >>>>>>>>> - descending
> >>>>>>>>> - endkey
> >>>>>>>>> - endkey_docid
> >>>>>>>>> - inclusive_end
> >>>>>>>>> - startkey
> >>>>>>>>> - startkey_docid
> >>>>>>>>> - last_key
> >>>>>>>>> - update_seq
> >>>>>>>>> - timestamp
> >>>>>>>>> ```
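A minimal sketch of the base64 variant of such a bookmark (field names as proposed above; the protobuf option would differ only in serialization):

```
import base64
import json

# Sketch: encode/decode the proposed bookmark as base64 JSON, opaque to
# clients but trivially inspectable by the server.
def encode_bookmark(state):
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()

def decode_bookmark(token):
    return json.loads(base64.urlsafe_b64decode(token.encode()))
```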
> >>>>>>>>> 
> >>>>>>>>> 2) Implement per-endpoint configurable max limits
> >>>>>>>>> ```
> >>>>>>>>> _all_docs = 5000
> >>>>>>>>> _all_docs/queries = 5000
> >>>>>>>>> _all_dbs = 5000
> >>>>>>>>> _dbs_info = 5000
> >>>>>>>>> _view = 2500
> >>>>>>>>> _view/queries = 2500
> >>>>>>>>> _find = 2500
> >>>>>>>>> ```
> >>>>>>>>> 
> >>>>>>>>> Later (after a few years) CouchDB would deprecate and remove the
> >>>>>>>>> old endpoints.
> >>>>>>>>> 
> >>>>>>>>> Best regards,
> >>>>>>>>> iilyak
> >>>>>>>>> 
> >>>>>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc <vatamane@apache.org> wrote:
> >>>>>>>>>> Hello everyone,
> >>>>>>>>>> 
> >>>>>>>>>> I'd like to discuss the shape and behavior of streaming APIs for
> >>>>>>>>>> CouchDB 4.x
> >>>>>>>>>> 
> >>>>>>>>>> By "streaming APIs" I mean APIs which stream data row by row as it
> >>>>>>>>>> gets read from the database. These are the endpoints I was thinking of:
> >>>>>>>>>> 
> >>>>>>>>>> _all_docs, _all_dbs, _dbs_info and query results
> >>>>>>>>>> 
> >>>>>>>>>> I want to focus on what happens when FoundationDB transactions
> >>>>>>>>>> time out after 5 seconds. Currently, all those APIs except
> >>>>>>>>>> _changes[1] feeds will crash or freeze. The reason is that the
> >>>>>>>>>> transaction_too_old error at the end of 5 seconds is retry-able by
> >>>>>>>>>> default, so the request handlers run again and end up shoving the
> >>>>>>>>>> whole request down the socket again, headers and all, which is
> >>>>>>>>>> obviously broken and not what we want.
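The failure mode is easy to reproduce in miniature with the Python bindings: a retry-able handler re-runs in full, so anything already written to the response is written again. The `send_headers`/`send_row` helpers below are hypothetical stand-ins for the HTTP layer, and the sketch mirrors the bug rather than actual chttpd code.

```
import fdb

fdb.api_version(620)

def send_headers(resp, status):  # hypothetical stand-in for the HTTP layer
    resp.append(f"HTTP/1.1 {status} OK\r\n\r\n".encode())

def send_row(resp, key, value):  # hypothetical stand-in for row serialization
    resp.append(key + b"\n")

@fdb.transactional
def stream_all_docs(tr, resp):
    # BROKEN: transaction_too_old after ~5s makes the decorator retry,
    # re-running this whole function and re-sending headers and rows.
    send_headers(resp, 200)
    for kv in tr.get_range(b"docs/", b"docs0"):
        send_row(resp, kv.key, kv.value)
```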
> >>>>>>>>>> 
> >>>>>>>>>> There are a few alternatives discussed in the couchdb-dev channel.
> >>>>>>>>>> I'll present some behaviors but feel free to add more. Some ideas
> >>>>>>>>>> might have been discounted in the IRC discussion already but I'll
> >>>>>>>>>> present them anyway in case it sparks further conversation:
> >>>>>>>>>> 
> >>>>>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and
> >>>>>>>>>> continue streaming the data from the next key after the last one
> >>>>>>>>>> emitted in the previous transaction. Document the API behavior
> >>>>>>>>>> change: the result may present a view of the data that is never a
> >>>>>>>>>> point-in-time[4] snapshot of the DB.
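A sketch of option A's restart-and-continue loop, using the Python bindings (error code 1007 is transaction_too_old); a sketch of the idea, not the CouchDB implementation:

```
import fdb

fdb.api_version(620)

# Sketch of option A: when the 5-second limit hits, open a fresh
# transaction and resume from the key after the last row emitted.
def stream_range(db, begin, end, emit):
    last = None
    while True:
        tr = db.create_transaction()
        try:
            start = fdb.KeySelector.first_greater_than(last) if last else begin
            for kv in tr.get_range(start, end):
                emit(kv)
                last = kv.key
            return
        except fdb.FDBError as err:
            if err.code != 1007:  # 1007 = transaction_too_old
                raise
```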
> >>>>>>>>>> 
> >>>>>>>>>> - Keeps the API shape the same as CouchDB <4.0. Client libraries
> >>>>>>>>>> don't have to change to continue using these CouchDB 4.0 endpoints.
> >>>>>>>>>> - This is the easiest to implement since it would re-use the
> >>>>>>>>>> implementation for the _changes feed (an extra option passed to the
> >>>>>>>>>> fold function).
> >>>>>>>>>> - Breaks API behavior if users relied on having a point-in-time[4]
> >>>>>>>>>> snapshot view of the data.
> >>>>>>>>>> 
> >>>>>>>>>> B) Simply end the stream. Let the users pass a `?transaction=true`
> >>>>>>>>>> param which indicates they are aware the stream may end early and
> >>>>>>>>>> so would have to paginate from the last emitted key with a skip=1.
> >>>>>>>>>> This will keep the request bodies the same as current CouchDB.
> >>>>>>>>>> However, if the users got all the data in one request, they will
> >>>>>>>>>> end up wasting another request to see if there is more data
> >>>>>>>>>> available. If they didn't get any data they might have too large a
> >>>>>>>>>> skip value (see [2]) so they would have to guess different values
> >>>>>>>>>> for start/end keys. Or we impose a max limit for the `skip`
> >>>>>>>>>> parameter.
> >>>>>>>>>> 
> >>>>>>>>>> C) End the stream and add a final metadata row like a
> >>>>>>>>>> "transaction": "timeout" at the end. That will let the user know to
> >>>>>>>>>> keep paginating from the last key onward. This won't work for
> >>>>>>>>>> `_all_dbs` and `_dbs_info`[3]. Maybe let those two endpoints behave
> >>>>>>>>>> like _changes feeds and only use this for views and _all_docs? If
> >>>>>>>>>> we like this choice, let's think about what happens for those, as I
> >>>>>>>>>> couldn't come up with anything decent there.
> >>>>>>>>>> 
> >>>>>>>>>> D) Same as C but, to solve the issue with skips[2], emit a
> >>>>>>>>>> bookmark "key" of where the iteration stopped and the current
> >>>>>>>>>> "skip" and "limit" params, which would keep decreasing. Then the
> >>>>>>>>>> user would pass those in "start_key=..." in the next request along
> >>>>>>>>>> with the limit and skip params. So something like
> >>>>>>>>>> "continuation":{"skip":599, "limit":5, "key":"..."}. This has the
> >>>>>>>>>> same issue with array results for `_all_dbs` and `_dbs_info`[3].
> >>>>>>>>>> 
> >>>>>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum
> >>>>>>>>>> values there such that the response time is likely to fit in one
> >>>>>>>>>> transaction. This could be tricky as different runtime environments
> >>>>>>>>>> will have different characteristics. Also, if the timeout happens
> >>>>>>>>>> there isn't a nice way to send an HTTP error since we already sent
> >>>>>>>>>> the 200 response. The downside is that this might break how some
> >>>>>>>>>> users use the API, if say they are using large skips and limits
> >>>>>>>>>> already. Perhaps here we do both B and D, such that if users want
> >>>>>>>>>> transactional behavior, they specify the `transaction=true` param
> >>>>>>>>>> and only then we enforce low limit and skip maximums.
> >>>>>>>>>> 
> >>>>>>>>>> F) At least for `_all_docs` it seems providing a point-in-time
> >>>>>>>>>> snapshot view doesn't necessarily need to be tied to transaction
> >>>>>>>>>> boundaries. We could check the update sequence of the database at
> >>>>>>>>>> the start of the next transaction and if it hasn't changed we can
> >>>>>>>>>> continue emitting a consistent view. This can apply to C and D and
> >>>>>>>>>> would just determine when the stream ends. If there are no writes
> >>>>>>>>>> happening to the db, this could potentially stream all the data
> >>>>>>>>>> just like option A would. Not entirely sure if this would work
> >>>>>>>>>> for views.
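Option F could bolt onto the same restart loop: before resuming, compare the database's update sequence against the one observed at the start. In the sketch below, `UPDATE_SEQ_KEY` is a placeholder; where the sequence actually lives in the FDB keyspace is an implementation detail of the CouchDB layer.

```
import fdb

fdb.api_version(620)

UPDATE_SEQ_KEY = b"\x00update_seq"  # placeholder key, not the real layout

class StaleSnapshot(Exception):
    pass

# Sketch of option F: keep streaming across transactions only while the
# database's update sequence stays unchanged.
def stream_consistent(db, begin, end, emit):
    last, seq0 = None, None
    while True:
        tr = db.create_transaction()
        try:
            seq = tr[UPDATE_SEQ_KEY].wait()
            if seq0 is None:
                seq0 = seq
            elif seq != seq0:
                raise StaleSnapshot("db changed between transactions")
            start = fdb.KeySelector.first_greater_than(last) if last else begin
            for kv in tr.get_range(start, end):
                emit(kv)
                last = kv.key
            return
        except fdb.FDBError as err:
            if err.code != 1007:  # transaction_too_old
                raise
```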
> >>>>>>>>>> 
> >>>>>>>>>> So what do we think? I can see different combinations of options
> >>>>>>>>>> here, maybe even different ones for each API endpoint. For example
> >>>>>>>>>> `_all_dbs` and `_dbs_info` are always A, and `_all_docs` and views
> >>>>>>>>>> default to A but have parameters to do F, etc.
> >>>>>>>>>> 
> >>>>>>>>>> Cheers,
> >>>>>>>>>> -Nick
> >>>>>>>>>> 
> >>>>>>>>>> Some footnotes:
> >>>>>>>>>> 
> >>>>>>>>>> [1] The _changes feed is the only one that works currently. It
> >>>>>>>>>> behaves as per this RFC:
> >>>>>>>>>> https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns
> >>>>>>>>>> That is, we continue streaming the data by resetting the
> >>>>>>>>>> transaction object and restarting from the last emitted key (the db
> >>>>>>>>>> sequence in this case). However, because the transaction restarts,
> >>>>>>>>>> if a document is updated while the streaming takes place it may
> >>>>>>>>>> appear in the _changes feed twice. That's a behavior difference
> >>>>>>>>>> from CouchDB < 4.0 and we'd have to document it, since previously
> >>>>>>>>>> we presented a point-in-time snapshot of the database from when we
> >>>>>>>>>> started streaming.
> >>>>>>>>>> 
> >>>>>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB
> >>>>>>>>>> doesn't currently support efficient offsets for key selectors
> >>>>>>>>>> (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging)
> >>>>>>>>>> we implemented skip by iterating over the data. This means that a
> >>>>>>>>>> skip of, say, 100000 could keep timing out the transaction without
> >>>>>>>>>> yielding any data.
> >>>>>>>>>> 
> >>>>>>>>>> [3] _all_dbs and _dbs_info return a JSON array so they don't have
> >>>>>>>>>> an obvious place to insert a last metadata row.
> >>>>>>>>>> 
> >>>>>>>>>> [4] For example they have a constraint that documents "a" and "z"
> >>>>>>>>>> cannot both be in the database at the same time. But when iterating
> >>>>>>>>>> it's possible that "a" was there at the start. Then by the end, "a"
> >>>>>>>>>> was removed and "z" added, so both "a" and "z" would appear in the
> >>>>>>>>>> emitted stream. Note that FoundationDB has APIs which exhibit the
> >>>>>>>>>> same "relaxed" constraints:
> >>>>>>>>>> https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
> >>>>>>>>>> 
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>> 
> >>>>>> 
> >> 
> 
> 
