couchdb-dev mailing list archives

From Glynn Bird <glynn.b...@gmail.com>
Subject Re: [DISCUSS] Streaming API in CouchDB 4.0
Date Thu, 23 Apr 2020 21:33:49 GMT
I don't think a whole new API is required here, but I would like to see
some sort of "bookmark" facility for _all_docs and views, as pagination
with the current API is awkward.

I would imagine it working as follows:

# first request
curl "$URL/mydb/_all_docs?startkey=\"aardvark\"&endkey=\"moose\"&limit=10"

^ the user has specified the range of values that they want, and limit
defines the "page size", if you will.

If the response to the first request contained a bookmark, the
user's second and subsequent requests could look like:

curl "$URL/mydb/_all_docs?bookmark=qfqwfqwfqfqw"  # "get me page 2 of the result set"
curl "$URL/mydb/_all_docs?bookmark=iihwhfwfwhwh"  # "get me page 3 of the result set"

So the bookmark need only encode:

- limit (the page size)
- startkey - that is the previous page's endkey "+1" e.g. dog\u0000
- endkey - the query's upper bound, as defined in the first request
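
To make the idea concrete, here is a minimal Python sketch of such a bookmark. Everything in it is an assumption for illustration only: the function names are hypothetical, the token is simply base64-encoded JSON, keys are assumed to be strings, and "+1" is approximated by appending \u0000 to the last key returned. It is not how CouchDB encodes anything today.

```python
import base64
import json


def encode_bookmark(limit, startkey, endkey):
    """Pack the paging state into an opaque, URL-safe token (hypothetical format)."""
    payload = json.dumps(
        {"limit": limit, "startkey": startkey, "endkey": endkey},
        separators=(",", ":"),
    )
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_bookmark(token):
    """Recover the paging state from a token made by encode_bookmark."""
    payload = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(payload.decode("utf-8"))


def next_bookmark(token, last_key_on_page):
    """Build the token for the following page.

    Assumes string keys: appending \u0000 yields the smallest string that
    sorts after last_key_on_page, which stands in for the "+1" above.
    """
    state = decode_bookmark(token)
    return encode_bookmark(state["limit"], last_key_on_page + "\u0000", state["endkey"])


first = encode_bookmark(10, "aardvark", "moose")
second = next_bookmark(first, "dog")
print(decode_bookmark(second))
```

Because the client only ever passes the token back, the server would stay free to change what goes inside it.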

Obviously there are other parameters to think through (start_key, end_key,
start_key_docid, descending, etc.) but this small iteration of the API
doesn't add any more clutter to the request path while making it substantially
easier for the end user or client libraries to iterate through a result set.
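
As a rough illustration of what a client library loop could look like, here is a Python sketch. The response shape ({"rows": [...], "bookmark": "..."}), the iterate_rows helper and the in-memory make_fake_server stand-in are all assumptions made up for this example; no such bookmark parameter exists on _all_docs today.

```python
import json


def iterate_rows(fetch_page, startkey, endkey, limit):
    """Yield every row in the range, following bookmarks until none is returned."""
    params = {"startkey": startkey, "endkey": endkey, "limit": limit}
    while True:
        page = fetch_page(params)
        yield from page["rows"]
        bookmark = page.get("bookmark")
        if not bookmark:
            return
        # Later requests carry only the bookmark, as in the curl examples.
        params = {"bookmark": bookmark}


def make_fake_server(keys):
    """In-memory stand-in for the server side (hypothetical, string keys, limit > 0)."""
    ordered = sorted(keys)

    def fetch_page(params):
        # The fake bookmark is just the JSON-encoded paging state.
        state = json.loads(params["bookmark"]) if "bookmark" in params else params
        in_range = [k for k in ordered if state["startkey"] <= k <= state["endkey"]]
        rows = in_range[: state["limit"]]
        response = {"rows": [{"key": k} for k in rows]}
        if len(in_range) > len(rows):
            # The next page starts just after the last key returned.
            response["bookmark"] = json.dumps(dict(state, startkey=rows[-1] + "\u0000"))
        return response

    return fetch_page


fetch = make_fake_server(["aardvark", "bat", "cat", "dog", "emu", "moose", "zebra"])
for row in iterate_rows(fetch, "aardvark", "moose", 2):
    print(row["key"])
```

The fake server can tell when the last page has been reached because it holds the whole key list in memory; a real implementation may not have that luxury, in which case the client would see one extra, empty page.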

On Thu, 23 Apr 2020 at 22:14, Paul Davis <paul.joseph.davis@gmail.com>
wrote:

> I'd agree that my initial reaction to cursor was that it's not a great
> fit, but that does seem to be the existing name used in the greater
> REST world for this sort of pagination so I'm not concerned about
> using that terminology.
>
> I'm generally on board with allowing and setting some default sane
> limits on pages. We probably should have done that quite a while ago
> after moving to native clustering and now that we have FDB limits I
> think it makes even more sense to have an API that does not lend
> itself to crazy errors when people are just trying to poke at an API.
>
> I think we're all on board that one of the goals is to make sure that
> clients don't accidentally misinterpret a response. That is, we're
> trying to be quite diligent that a user doesn't get 1000 rows and not
> realize there's another 10 that were beyond the limit. The bookmark
> approach with hard caps seems like a generally fine approach to me.
> The current approach uses extra URL path segments to try and avoid
> this confusion. I wonder if we should consider starting to properly
> version our API using one of the many schemes that are used. Having
> read through a few articles I don't have a very clear favorite though.
>
> As to this particular proposal I do see a couple issues:
>
> `total` - We can do this in most cases fairly easily. Though it's a
> bit odd for continuous changes.
>
> `complete` - I'm not sure whether this is entirely possible given the
> API that FDB presents us. Specifically, when we set a range and we get
> back exactly $num_rows in the response, if the data set ended at
> exactly that page I don't think the `more` flag from fdb would tell us
> that. So we'd have a clunky UX there where we say not complete but the
> next page is empty. That's also not to mention that depending on
> whether we're looking at snapshots and so on that there's no way for
> us to know between stateless requests whether there were more rows
> added to the end.
>
> `page` - This one is just hard/impossible to calculate. FDB doesn't
> provide us with offsets or even an efficient "about how many rows in
> this range?" type queries so providing that would be both inaccurate
> and fairly difficult/expensive to calculate. In some cases I think we
> could have something maybe close that didn't suck too badly, but it'd
> also fall down for changes as well due to the way that updates reorder
> them.
>
> `update_seq` - I'm just not sure on when this would be useful or what
> it would refer to. Maybe a version stamp of the last change for that
> request? If we had a future API that asked for a snapshot access then
> maybe? But if we did do something there with versionstamps or read
> versions I'd expect that to come with the rest of the API.
>
> For the bookmark fields:
>
> `direction` vs `descending` seems like a field duplication to me.
>
> `page` - This would seem to suggest we could skip to a certain
> location in the results numerically which we are not able to do with
> the FDB API.
>
> `last_key` vs `start_key` seems like a field duplication. We don't
> need to know where things started I don't think. Just where to start
> from and where to end.
>
> `update_seq` - is same as earlier. Not entirely sure on the intent there.
>
> `timestamp` - Expiring bookmarks based on time does not seem like a
> good idea. Both for clock skew and why bother when this would
> functionally just be a convenience API that users could already
> implement for themselves.
>
> Another thing might also be to provide our bookmark as a full link
> that seems to be fairly standard REST practice these days. Something
> that clients don't have to do any logic with so that we're free to
> change the implementation.
>
> And lastly, I don't think we should be neglecting the _changes API as
> part of this discussion. I realize that we'll need to support the
> older streaming semantics if we want to maintain replication
> compatibility (which I think we'll all agree is a Good Thing) but it
> also feels a bit wrong to ignore it as part of this work if we're
> going to be modernizing our APIs. Though if we do pick up a good
> versioning scheme then we could theoretically make those changes
> easily enough. Plus, who doesn't want to rewrite chttpd to be a whole
> lot less... chttpd-y?
>
>
> On Thu, Apr 23, 2020 at 1:43 PM Robert Samuel Newson <rnewson@apache.org>
> wrote:
> >
> >
> > I think it's a key difference from "cursor" as I've seen them elsewhere,
> > that ours will point at an ever changing database, you couldn't seamlessly
> > cursor through a large data set, one "page" at a time.
> >
> > Bookmarks began in search (raises guilty hand) in order to address a
> > Lucene-specific issue (that high values of "skip" are incredibly
> > inefficient, using lots of RAM). That is not true for CouchDB's own
> > indexes, which can be navigated perfectly with
> > startkey/endkey/startkey_docid/endkey_docid, etc.
> >
> > I guess I'm not helping much with these observations but I wouldn't like
> > to see CouchDB gain an additional and ugly method of doing something
> > already possible.
> >
> > B.
> >
> > > On 23 Apr 2020, at 19:02, Joan Touzet <wohali@apache.org> wrote:
> > >
> > > I realise this is bikeshedding, but I guess that's kind of the
> > > point... Everything below is my opinion, not "fact."
> > >
> > > It's unfortunate we need a new endpoint for all of this. In a vacuum I
> > > might have just suggested we use the semantics we already have, perhaps
> > > with ?from= instead of ?since= .
> > >
> > > "page" only works if the size of a page is well known, either by
> > > server preference or directly in the URL. If I ask for:
> > >
> > >  GET /{db}/_all_docs?limit=20&page=3
> > >
> > > I know that I'm always going to get document 41 through 60 in the
> > > default collation order.
> > >
> > > There's a *fantastic* summary of examples from popular REST APIs here:
> > >
> > > https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c
> > >
> > > We are *pretty close* to what a cursor means in those other examples,
> > > except for the fact that our cursor can go stale/invalid after a short time.
> > >
> > > Bob, could you be a bit more detailed in your explanation how our
> > > definition isn't close to these? Or did you mean SQL CURSOR (which is
> > > something entirely different?) If so, I'm fine with this being a REST API
> > > cursor - something clearly distinct.
> > >
> > > I come back to wanting to preserve the existing endpoint syntax and
> > > naming, without new endpoints, but specifying this new FDB token via
> > > ?cursor= and this being the trigger for the new behaviour. At some point,
> > > we simply stop accepting ?since= tokens. This seems in line with other
> > > popular REST APIs.
> > >
> > > -Joan "still sick and not sleeping right" Touzet
> > >
> > >
> > > On 2020-04-23 12:30, Robert Newson wrote:
> > >> cursor has established meaning in other databases and ours would not
> > >> be very close to them. I don’t think it’s a good idea.
> > >> B.
> > >>> On 23 Apr 2020, at 11:50, Ilya Khlopotov <iilyak@apache.org> wrote:
> > >>>
> > >>> 
> > >>>>
> > >>>> The best I could come up with is replacing page with
> > >>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
> > >>> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
> > >>>
> > >>>> On 2020/04/23 08:54:36, Garren Smith <garren@apache.org> wrote:
> > >>>> I agree with Bob that page doesn't make sense as an endpoint. I'm also
> > >>>> rubbish with naming. The best I could come up with is replacing page with
> > >>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
> > >>>> All the fields in the bookmark make sense except timestamp. Why would it
> > >>>> matter if the timestamp is old? What happens if a node's time is an hour
> > >>>> behind another node?
> > >>>>
> > >>>>
> > >>>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov <iilyak@apache.org> wrote:
> > >>>>>
> > >>>>> - page is to provide some notion of progress for user
> > >>>>> - timestamp - I was thinking that we should drop requests if user would
> > >>>>> try to pass bookmark created an hour ago.
> > >>>>>
> > >>>>> On 2020/04/22 21:58:40, Robert Samuel Newson <rnewson@apache.org> wrote:
> > >>>>>> "page" and "page number" are odd to me as these don't exist as concepts,
> > >>>>>> I'd rather not invent them. I note there's no mention of page size, which
> > >>>>>> makes "page number" very vague.
> > >>>>>>
> > >>>>>> What is "timestamp" in the bookmark and what effect does it have when
> > >>>>>> the bookmark is passed back in?
> > >>>>>>
> > >>>>>> I guess, why does the bookmark include so much extraneous data? Items
> > >>>>>> that are not needed to find the fdb key to begin the next response from.
> > >>>>>>
> > >>>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov <iilyak@apache.org> wrote:
> > >>>>>>>
> > >>>>>>> Hello everyone,
> > >>>>>>>
> > >>>>>>> Based on the discussions on the thread I would like to propose a
> > >>>>>>> number of first steps:
> > >>>>>>> 1) introduce new endpoints
> > >>>>>>> - {db}/_all_docs/page
> > >>>>>>> - {db}/_all_docs/queries/page
> > >>>>>>> - _all_dbs/page
> > >>>>>>> - _dbs_info/page
> > >>>>>>> - {db}/_design/{ddoc}/_view/{view}/page
> > >>>>>>> - {db}/_design/{ddoc}/_view/{view}/queries/page
> > >>>>>>> - {db}/_find/page
> > >>>>>>>
> > >>>>>>> These new endpoints would act as follows:
> > >>>>>>> - don't use delayed responses
> > >>>>>>> - return object with following structure
> > >>>>>>> ```
> > >>>>>>> {
> > >>>>>>>    "total": Total,
> > >>>>>>>    "bookmark": base64 encoded opaque value,
> > >>>>>>>    "completed": true | false,
> > >>>>>>>    "update_seq": when available,
> > >>>>>>>    "page": current page number,
> > >>>>>>>    "items": [
> > >>>>>>>    ]
> > >>>>>>> }
> > >>>>>>> ```
> > >>>>>>> - the bookmark would include following data (base64 or protobuff???):
> > >>>>>>> - direction
> > >>>>>>> - page
> > >>>>>>> - descending
> > >>>>>>> - endkey
> > >>>>>>> - endkey_docid
> > >>>>>>> - inclusive_end
> > >>>>>>> - startkey
> > >>>>>>> - startkey_docid
> > >>>>>>> - last_key
> > >>>>>>> - update_seq
> > >>>>>>> - timestamp
> > >>>>>>>
> > >>>>>>> 2) Implement per-endpoint configurable max limits
> > >>>>>>> ```
> > >>>>>>> _all_docs = 5000
> > >>>>>>> _all_docs/queries = 5000
> > >>>>>>> _all_dbs = 5000
> > >>>>>>> _dbs_info = 5000
> > >>>>>>> _view = 2500
> > >>>>>>> _view/queries = 2500
> > >>>>>>> _find = 2500
> > >>>>>>> ```
> > >>>>>>>
> > >>>>>>> Later (after a few years) CouchDB would deprecate and remove old
> > >>>>>>> endpoints.
> > >>>>>>>
> > >>>>>>> Best regards,
> > >>>>>>> iilyak
> > >>>>>>>
> > >>>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc <vatamane@apache.org> wrote:
> > >>>>>>>> Hello everyone,
> > >>>>>>>>
> > >>>>>>>> I'd like to discuss the shape and behavior of streaming APIs for
> > >>>>>>>> CouchDB 4.x
> > >>>>>>>>
> > >>>>>>>> By "streaming APIs" I mean APIs which stream data row by row as it gets
> > >>>>>>>> read from the database. These are the endpoints I was thinking of:
> > >>>>>>>>
> > >>>>>>>> _all_docs, _all_dbs, _dbs_info  and query results
> > >>>>>>>>
> > >>>>>>>> I want to focus on what happens when FoundationDB transactions
> > >>>>>>>> time out after 5 seconds. Currently, all those APIs except _changes[1]
> > >>>>>>>> feeds will crash or freeze. The reason is that the
> > >>>>>>>> transaction_too_old error at the end of 5 seconds is retry-able by
> > >>>>>>>> default, so the request handlers run again and end up shoving the
> > >>>>>>>> whole request down the socket again, headers and all, which is
> > >>>>>>>> obviously broken and not what we want.
> > >>>>>>>>
> > >>>>>>>> There are a few alternatives discussed in the couchdb-dev channel. I'll
> > >>>>>>>> present some behaviors but feel free to add more. Some ideas might
> > >>>>>>>> have been discounted on the IRC discussion already but I'll present
> > >>>>>>>> them anyway in case it sparks further conversation:
> > >>>>>>>>
> > >>>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and continue
> > >>>>>>>> streaming the data from the next key after the last one emitted in the
> > >>>>>>>> previous transaction. Document the API behavior change that it may
> > >>>>>>>> present a view of the data that is never a point-in-time[4] snapshot of
> > >>>>>>>> the DB.
> > >>>>>>>>
> > >>>>>>>> - Keeps the API shape the same as CouchDB <4.0. Client libraries
> > >>>>>>>> don't have to change to continue using these CouchDB 4.0 endpoints
> > >>>>>>>> - This is the easiest to implement since it would re-use the
> > >>>>>>>> implementation for _changes feed (an extra option passed to the fold
> > >>>>>>>> function).
> > >>>>>>>> - Breaks API behavior if users relied on having a point-in-time[4]
> > >>>>>>>> snapshot view of the data.
> > >>>>>>>>
> > >>>>>>>> B) Simply end the stream. Let the users pass a `?transaction=true`
> > >>>>>>>> param which indicates they are aware the stream may end early and so
> > >>>>>>>> would have to paginate from the last emitted key with a skip=1. This
> > >>>>>>>> will keep the request bodies the same as current CouchDB. However, if
> > >>>>>>>> the users got all the data in one request, they will end up wasting
> > >>>>>>>> another request to see if there is more data available. If they didn't
> > >>>>>>>> get any data they might have too large a skip value (see [2]) so
> > >>>>>>>> would have to guess different values for start/end keys. Or impose a max
> > >>>>>>>> limit for the `skip` parameter.
> > >>>>>>>>
> > >>>>>>>> C) End the stream and add a final metadata row like a "transaction":
> > >>>>>>>> "timeout" at the end. That will let the user know to keep paginating
> > >>>>>>>> from the last key onward. This won't work for `_all_dbs` and
> > >>>>>>>> `_dbs_info`[3]. Maybe let those two endpoints behave like _changes
> > >>>>>>>> feeds and only use this for views and _all_docs? If we like this
> > >>>>>>>> choice, let's think what happens for those as I couldn't come up with
> > >>>>>>>> anything decent there.
> > >>>>>>>>
> > >>>>>>>> D) Same as C but to solve the issue with skips[2], emit a bookmark
> > >>>>>>>> "key" of where the iteration stopped and the current "skip" and
> > >>>>>>>> "limit" params, which would keep decreasing. Then user would pass
> > >>>>>>>> those in "start_key=..." in the next request along with the limit and
> > >>>>>>>> skip params. So something like "continuation":{"skip":599, "limit":5,
> > >>>>>>>> "key":"..."}. This has the same issue with array results for
> > >>>>>>>> `_all_dbs` and `_dbs_info`[3].
> > >>>>>>>>
> > >>>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum values
> > >>>>>>>> there such that response time is likely to fit in one transaction.
> > >>>>>>>> This could be tricky as different runtime environments will have
> > >>>>>>>> different characteristics. Also, if the timeout happens there isn't
> > >>>>>>>> a nice way to send an HTTP error since we already sent the 200
> > >>>>>>>> response. The downside is that this might break how some users use the
> > >>>>>>>> API, if say they are using large skips and limits already. Perhaps here
> > >>>>>>>> we do both B and D, such that if users want transactional behavior,
> > >>>>>>>> they specify that `transaction=true` param and only then we enforce
> > >>>>>>>> low limit and skip maximums.
> > >>>>>>>>
> > >>>>>>>> F) At least for `_all_docs` it seems providing a point-in-time
> > >>>>>>>> snapshot view doesn't necessarily need to be tied to transaction
> > >>>>>>>> boundaries. We could check the update sequence of the database at the
> > >>>>>>>> start of the next transaction and if it hasn't changed we can continue
> > >>>>>>>> emitting a consistent view. This can apply to C and D and would just
> > >>>>>>>> determine when the stream ends. If there are no writes happening to
> > >>>>>>>> the db, this could potentially stream all the data just like option A
> > >>>>>>>> would do. Not entirely sure if this would work for views.
> > >>>>>>>>
> > >>>>>>>> So what do we think? I can see different combinations of options here,
> > >>>>>>>> maybe even different for each API point. For example `_all_dbs`,
> > >>>>>>>> `_dbs_info` are always A, and `_all_docs` and views default to A but
> > >>>>>>>> have parameters to do F, etc.
> > >>>>>>>>
> > >>>>>>>> Cheers,
> > >>>>>>>> -Nick
> > >>>>>>>>
> > >>>>>>>> Some footnotes:
> > >>>>>>>>
> > >>>>>>>> [1] _changes feeds is the only one that works currently. It behaves as
> > >>>>>>>> per RFC
> > >>>>>>>> <https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns>.
> > >>>>>>>> That is, we continue streaming the data by resetting the transaction
> > >>>>>>>> object and restarting from the last emitted key (db sequence in this
> > >>>>>>>> case). However, because the transaction restarts, if a document is
> > >>>>>>>> updated while the streaming takes place, it may appear in the _changes
> > >>>>>>>> feed twice. That's a behavior difference from CouchDB < 4.0 and we'd
> > >>>>>>>> have to document it, since previously we presented this point-in-time
> > >>>>>>>> snapshot of the database from when we started streaming.
> > >>>>>>>>
> > >>>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB doesn't
> > >>>>>>>> currently support efficient offsets for key selectors
> > >>>>>>>> (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging)
> > >>>>>>>> we implemented skip by iterating over the data. This means that a skip
> > >>>>>>>> of say 100000 could keep timing out the transaction without yielding
> > >>>>>>>> any data.
> > >>>>>>>>
> > >>>>>>>> [3] _all_dbs and _dbs_info return a JSON array so they don't have an
> > >>>>>>>> obvious place to insert a last metadata row.
> > >>>>>>>>
> > >>>>>>>> [4] For example they have a constraint that documents "a" and "z"
> > >>>>>>>> cannot both be in the database at the same time. But when iterating
> > >>>>>>>> it's possible that "a" was there at the start. Then by the end, "a"
> > >>>>>>>> was removed and "z" added, so both "a" and "z" would appear in the
> > >>>>>>>> emitted stream. Note that FoundationDB has APIs which exhibit the same
> > >>>>>>>> "relaxed" constraints:
> > >>>>>>>>
> > >>>>>>>> https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
> > >>>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> >
>
