Subject: Re: [DISCUSS] Streaming API in CouchDB 4.0
From: Ilya Khlopotov
To: dev@couchdb.apache.org
Date: Fri, 24 Apr 2020 10:05:16 -0000
In-Reply-To: <30f3e543-4cb8-d20d-21d6-74761b3c156f@apache.org>

> https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c

Very good article. My PoC experiment is in fact an implementation of cursor-based
pagination. Even though the bookmark encodes all non-default values of mrargs, the
algorithm only uses:
- limit - doesn't change
- start_key - updated for every bookmark as we iterate
- end_key - doesn't change
- direction - doesn't change

(A rough illustrative sketch of this is included further down in this message.)

Best regards,
iilyak

On 2020/04/23 18:02:34, Joan Touzet wrote:
> I realise this is bikeshedding, but I guess that's kind of the point...
> Everything below is my opinion, not "fact."
>
> It's unfortunate we need a new endpoint for all of this. In a vacuum I
> might have just suggested we use the semantics we already have, perhaps
> with ?from= instead of ?since= .
>
> "page" only works if the size of a page is well known, either by server
> preference or directly in the URL. If I ask for:
>
> GET /{db}/_all_docs?limit=20&page=3
>
> I know that I'm always going to get documents 41 through 60 in the
> default collation order.
>
> There's a *fantastic* summary of examples from popular REST APIs here:
>
> https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c
>
> We are *pretty close* to what a cursor means in those other examples,
> except for the fact that our cursor can go stale/invalid after a short time.
>
> Bob, could you be a bit more detailed in your explanation of how our
> definition isn't close to these? Or did you mean SQL CURSOR (which is
> something entirely different)? If so, I'm fine with this being a REST
> API cursor - something clearly distinct.
>
> I come back to wanting to preserve the existing endpoint syntax and
> naming, without new endpoints, but specifying this new FDB token via
> ?cursor= and this being the trigger for the new behaviour. At some
> point, we simply stop accepting ?since= tokens. This seems in line with
> other popular REST APIs.
>
> -Joan "still sick and not sleeping right" Touzet
>
> On 2020-04-23 12:30, Robert Newson wrote:
> > "cursor" has an established meaning in other databases and ours would not
> > be very close to them. I don’t think it’s a good idea.
> >
> > B.
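To make the PoC behaviour described at the top of this message a bit more concrete, here is a rough sketch in Python (purely illustrative; the real code would live in CouchDB's Erlang codebase, and base64-encoded JSON is just one possible encoding for the opaque bookmark):

```
import base64
import json

def decode_bookmark(bookmark):
    # The bookmark is treated as an opaque value by clients; base64(JSON)
    # here is only an illustration (protobuf was also mentioned).
    return json.loads(base64.urlsafe_b64decode(bookmark.encode()))

def encode_bookmark(args):
    return base64.urlsafe_b64encode(json.dumps(args).encode()).decode()

def next_bookmark(bookmark, last_emitted_key):
    # Only start_key advances between pages; limit, end_key and direction
    # are carried along unchanged, exactly as listed above.
    args = decode_bookmark(bookmark)
    args["start_key"] = last_emitted_key
    return encode_bookmark(args)
```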
> >
> >> On 23 Apr 2020, at 11:50, Ilya Khlopotov wrote:
> >>
> >>> The best I could come up with is replacing page with
> >>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
> >>
> >> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
> >>
> >>> On 2020/04/23 08:54:36, Garren Smith wrote:
> >>> I agree with Bob that page doesn't make sense as an endpoint. I'm also
> >>> rubbish with naming. The best I could come up with is replacing page with
> >>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs.
> >>> All the fields in the bookmark make sense except timestamp. Why would it
> >>> matter if the timestamp is old? What happens if a node's time is an hour
> >>> behind another node?
> >>>
> >>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov wrote:
> >>>>
> >>>> - page is to provide some notion of progress for the user
> >>>> - timestamp - I was thinking that we should drop requests if a user
> >>>> tries to pass a bookmark created an hour ago.
> >>>>
> >>>> On 2020/04/22 21:58:40, Robert Samuel Newson wrote:
> >>>>> "page" and "page number" are odd to me as these don't exist as concepts,
> >>>>> I'd rather not invent them. I note there's no mention of page size, which
> >>>>> makes "page number" very vague.
> >>>>>
> >>>>> What is "timestamp" in the bookmark and what effect does it have when
> >>>>> the bookmark is passed back in?
> >>>>>
> >>>>> I guess, why does the bookmark include so much extraneous data? Items
> >>>>> that are not needed to find the fdb key to begin the next response from.
> >>>>>
> >>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov wrote:
> >>>>>>
> >>>>>> Hello everyone,
> >>>>>>
> >>>>>> Based on the discussions on the thread I would like to propose a
> >>>>>> number of first steps:
> >>>>>> 1) introduce new endpoints
> >>>>>>    - {db}/_all_docs/page
> >>>>>>    - {db}/_all_docs/queries/page
> >>>>>>    - _all_dbs/page
> >>>>>>    - _dbs_info/page
> >>>>>>    - {db}/_design/{ddoc}/_view/{view}/page
> >>>>>>    - {db}/_design/{ddoc}/_view/{view}/queries/page
> >>>>>>    - {db}/_find/page
> >>>>>>
> >>>>>>    These new endpoints would act as follows:
> >>>>>>    - don't use delayed responses
> >>>>>>    - return an object with the following structure
> >>>>>>      ```
> >>>>>>      {
> >>>>>>          "total": Total,
> >>>>>>          "bookmark": base64 encoded opaque value,
> >>>>>>          "completed": true | false,
> >>>>>>          "update_seq": when available,
> >>>>>>          "page": current page number,
> >>>>>>          "items": [
> >>>>>>          ]
> >>>>>>      }
> >>>>>>      ```
> >>>>>>    - the bookmark would include the following data (base64 or protobuf???):
> >>>>>>      - direction
> >>>>>>      - page
> >>>>>>      - descending
> >>>>>>      - endkey
> >>>>>>      - endkey_docid
> >>>>>>      - inclusive_end
> >>>>>>      - startkey
> >>>>>>      - startkey_docid
> >>>>>>      - last_key
> >>>>>>      - update_seq
> >>>>>>      - timestamp
> >>>>>>
> >>>>>> 2) Implement per-endpoint configurable max limits
> >>>>>>    ```
> >>>>>>    _all_docs = 5000
> >>>>>>    _all_docs/queries = 5000
> >>>>>>    _all_dbs = 5000
> >>>>>>    _dbs_info = 5000
> >>>>>>    _view = 2500
> >>>>>>    _view/queries = 2500
> >>>>>>    _find = 2500
> >>>>>>    ```
> >>>>>>
> >>>>>> Later (after a few years) CouchDB would deprecate and remove the old
> >>>>>> endpoints.
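To show how a consumer might use the proposed endpoints, here is a hypothetical client loop against {db}/_all_docs/page. The path and the "items"/"completed"/"bookmark" fields are taken from the proposal above and may well change; this is only a sketch of the intended interaction, not an existing API:

```
import requests

def fetch_all(base_url, db):
    # Follow bookmarks until the server reports that the result set is complete.
    url = f"{base_url}/{db}/_all_docs/page"
    params = {}
    while True:
        body = requests.get(url, params=params).json()
        for item in body["items"]:
            yield item
        if body.get("completed"):
            break
        # The bookmark is opaque; the client simply passes it back unchanged.
        params = {"bookmark": body["bookmark"]}
```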
> >>>>>>
> >>>>>> Best regards,
> >>>>>> iilyak
> >>>>>>
> >>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc wrote:
> >>>>>>> Hello everyone,
> >>>>>>>
> >>>>>>> I'd like to discuss the shape and behavior of streaming APIs for
> >>>>>>> CouchDB 4.x.
> >>>>>>>
> >>>>>>> By "streaming APIs" I mean APIs which stream data row by row as it
> >>>>>>> gets read from the database. These are the endpoints I was thinking of:
> >>>>>>>
> >>>>>>> _all_docs, _all_dbs, _dbs_info and query results
> >>>>>>>
> >>>>>>> I want to focus on what happens when FoundationDB transactions
> >>>>>>> time out after 5 seconds. Currently, all those APIs except _changes[1]
> >>>>>>> feeds will crash or freeze. The reason is that the
> >>>>>>> transaction_too_old error at the end of 5 seconds is retryable by
> >>>>>>> default, so the request handlers run again and end up shoving the
> >>>>>>> whole request down the socket again, headers and all, which is
> >>>>>>> obviously broken and not what we want.
> >>>>>>>
> >>>>>>> There are a few alternatives discussed in the couchdb-dev channel. I'll
> >>>>>>> present some behaviors but feel free to add more. Some ideas might
> >>>>>>> have been discounted in the IRC discussion already but I'll present
> >>>>>>> them anyway in case it sparks further conversation:
> >>>>>>>
> >>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and continue
> >>>>>>> streaming the data from the next key after the last one emitted in the
> >>>>>>> previous transaction. Document the API behavior change: the view of the
> >>>>>>> data presented this way is no longer a point-in-time[4] snapshot of the DB.
> >>>>>>>
> >>>>>>> - Keeps the API shape the same as CouchDB <4.0. Client libraries
> >>>>>>> don't have to change to continue using these CouchDB 4.0 endpoints.
> >>>>>>> - This is the easiest to implement since it would re-use the
> >>>>>>> implementation for the _changes feed (an extra option passed to the
> >>>>>>> fold function).
> >>>>>>> - Breaks API behavior if users relied on having a point-in-time[4]
> >>>>>>> snapshot view of the data.
> >>>>>>>
> >>>>>>> B) Simply end the stream. Let the users pass a `?transaction=true`
> >>>>>>> param which indicates they are aware the stream may end early and so
> >>>>>>> they would have to paginate from the last emitted key with a skip=1. This
> >>>>>>> will keep the request bodies the same as current CouchDB. However, if
> >>>>>>> the users got all the data in one request, they will end up wasting
> >>>>>>> another request to see if there is more data available. If they didn't
> >>>>>>> get any data they might have too large a skip value (see [2]) and so
> >>>>>>> would have to guess different values for start/end keys. Or we impose a
> >>>>>>> max limit for the `skip` parameter.
> >>>>>>>
> >>>>>>> C) End the stream and add a final metadata row like "transaction":
> >>>>>>> "timeout" at the end. That will let the user know to keep paginating
> >>>>>>> from the last key onward. This won't work for `_all_dbs` and
> >>>>>>> `_dbs_info`[3]. Maybe let those two endpoints behave like _changes
> >>>>>>> feeds and only use this for views and _all_docs? If we like this
> >>>>>>> choice, let's think about what happens for those, as I couldn't come
> >>>>>>> up with anything decent there.
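As a rough illustration of what option C could look like from the client's side (the trailing metadata row, its "transaction": "timeout" shape and the resume-with-skip=1 behaviour are all hypothetical sketches of the idea above, not an existing API):

```
import json
import requests

def read_all_docs(base_url, db):
    params = {}
    while True:
        rows = requests.get(f"{base_url}/{db}/_all_docs", params=params).json()["rows"]
        timed_out = bool(rows) and rows[-1].get("transaction") == "timeout"
        if timed_out:
            rows = rows[:-1]  # drop the trailing metadata row
        for row in rows:
            yield row
        if not timed_out or not rows:
            break
        # Resume after the last emitted key; skip=1 steps over that key itself.
        params = {"start_key": json.dumps(rows[-1]["key"]), "skip": 1}
```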
> >>>>>>>
> >>>>>>> D) Same as C, but to solve the issue with skips[2], emit a bookmark
> >>>>>>> "key" of where the iteration stopped and the current "skip" and
> >>>>>>> "limit" params, which would keep decreasing. Then the user would pass
> >>>>>>> those in "start_key=..." in the next request along with the limit and
> >>>>>>> skip params. So something like "continuation":{"skip":599, "limit":5,
> >>>>>>> "key":"..."}. This has the same issue with array results for
> >>>>>>> `_all_dbs` and `_dbs_info`[3].
> >>>>>>>
> >>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum values
> >>>>>>> there such that the response time is likely to fit in one transaction.
> >>>>>>> This could be tricky as different runtime environments will have
> >>>>>>> different characteristics. Also, if the timeout happens there isn't
> >>>>>>> a nice way to send an HTTP error since we already sent the 200
> >>>>>>> response. The downside is that this might break how some users use the
> >>>>>>> API, if, say, they are using large skips and limits already. Perhaps here
> >>>>>>> we do both B and D, such that if users want transactional behavior,
> >>>>>>> they specify the `transaction=true` param and only then do we enforce
> >>>>>>> low limit and skip maximums.
> >>>>>>>
> >>>>>>> F) At least for `_all_docs` it seems providing a point-in-time
> >>>>>>> snapshot view doesn't necessarily need to be tied to transaction
> >>>>>>> boundaries. We could check the update sequence of the database at the
> >>>>>>> start of the next transaction and if it hasn't changed we can continue
> >>>>>>> emitting a consistent view. This can apply to C and D and would just
> >>>>>>> determine when the stream ends. If there are no writes happening to
> >>>>>>> the db, this could potentially stream all the data just like option A
> >>>>>>> would. Not entirely sure if this would work for views.
> >>>>>>>
> >>>>>>> So what do we think? I can see different combinations of options here,
> >>>>>>> maybe even different ones for each API endpoint. For example, `_all_dbs`
> >>>>>>> and `_dbs_info` are always A, and `_all_docs` and views default to A but
> >>>>>>> have parameters to do F, etc.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> -Nick
> >>>>>>>
> >>>>>>> Some footnotes:
> >>>>>>>
> >>>>>>> [1] The _changes feed is the only one that works currently. It behaves
> >>>>>>> as per the RFC:
> >>>>>>> https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns
> >>>>>>> That is, we continue streaming the data by resetting the transaction
> >>>>>>> object and restarting from the last emitted key (db sequence in this
> >>>>>>> case). However, because the transaction restarts, if a document is
> >>>>>>> updated while the streaming takes place it may appear in the _changes
> >>>>>>> feed twice. That's a behavior difference from CouchDB < 4.0 and we'd
> >>>>>>> have to document it, since previously we presented a point-in-time
> >>>>>>> snapshot of the database from when we started streaming.
> >>>>>>>
> >>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB doesn't
> >>>>>>> currently support efficient offsets for key selectors
> >>>>>>> (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging)
> >>>>>>> we implemented skip by iterating over the data. This means that a skip
> >>>>>>> of, say, 100000 could keep timing out the transaction without yielding
> >>>>>>> any data.
> >>>>>>>
> >>>>>>> [3] _all_dbs and _dbs_info return a JSON array so they don't have an
> >>>>>>> obvious place to insert a last metadata row.
> >>>>>>>
> >>>>>>> [4] For example, they have a constraint that documents "a" and "z"
> >>>>>>> cannot both be in the database at the same time.
> >>>>>>> But when iterating
> >>>>>>> it's possible that "a" was there at the start. Then by the end, "a"
> >>>>>>> was removed and "z" added, so both "a" and "z" would appear in the
> >>>>>>> emitted stream. Note that FoundationDB has APIs which exhibit the same
> >>>>>>> "relaxed" constraints:
> >>>>>>> https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
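Finally, a very rough sketch of the idea in option F above, using the FoundationDB Python binding only for illustration: keep restarting read transactions, but keep streaming only while the database's update sequence has not moved, so the response as a whole still reflects a single consistent snapshot. get_update_seq and emit_rows_from are hypothetical helpers, not real CouchDB functions (the actual implementation would be Erlang):

```
import fdb

fdb.api_version(620)

def stream_snapshot(db, start_key, get_update_seq, emit_rows_from):
    last_key = start_key
    snapshot_seq = None
    while True:
        tr = db.create_transaction()
        seq = get_update_seq(tr)  # hypothetical helper
        if snapshot_seq is None:
            snapshot_seq = seq
        elif seq != snapshot_seq:
            return False  # data changed between transactions; the snapshot is gone
        try:
            # Hypothetical helper: emits rows starting after last_key and
            # returns (new_last_key, done).
            last_key, done = emit_rows_from(tr, last_key)
            if done:
                return True
        except fdb.FDBError as err:
            if err.code != 1007:  # 1007 = transaction_too_old
                raise
            # The transaction timed out: loop around, re-check the update
            # sequence and continue from the last emitted key.
```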