Subject: Re: [DISCUSS] Streaming API in CouchDB 4.0
From: Ilya Khlopotov
To: dev@couchdb.apache.org
Date: Fri, 24 Apr 2020 10:05:16 -0000
In-Reply-To: <30f3e543-4cb8-d20d-21d6-74761b3c156f@apache.org>

> https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c

Very good article. My PoC experiment is in fact an implementation of cursor-based
pagination. Even though the bookmark encodes all non-default values of mrargs, the
algorithm only uses:
- limit - doesn't change
- start_key - updated for every bookmark as we iterate
- end_key - doesn't change
- direction - doesn't change

(A rough illustrative sketch of this is included further down in this message.)

Best regards,
iilyak

On 2020/04/23 18:02:34, Joan Touzet wrote:
> I realise this is bikeshedding, but I guess that's kind of the point...
> Everything below is my opinion, not "fact."
>
> It's unfortunate we need a new endpoint for all of this. In a vacuum I
> might have just suggested we use the semantics we already have, perhaps
> with ?from= instead of ?since= .
>
> "page" only works if the size of a page is well known, either by server
> preference or directly in the URL. If I ask for:
>
> GET /{db}/_all_docs?limit=20&page=3
>
> I know that I'm always going to get documents 41 through 60 in the
> default collation order.
>
> There's a *fantastic* summary of examples from popular REST APIs here:
>
> https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c
>
> We are *pretty close* to what a cursor means in those other examples,
> except for the fact that our cursor can go stale/invalid after a short time.
>
> Bob, could you be a bit more detailed in your explanation of how our
> definition isn't close to these? Or did you mean SQL CURSOR (which is
> something entirely different)? If so, I'm fine with this being a REST
> API cursor - something clearly distinct.
>
> I come back to wanting to preserve the existing endpoint syntax and
> naming, without new endpoints, but specifying this new FDB token via
> ?cursor= and this being the trigger for the new behaviour. At some
> point, we simply stop accepting ?since= tokens. This seems in line with
> other popular REST APIs.
>
> -Joan "still sick and not sleeping right" Touzet
>
> On 2020-04-23 12:30, Robert Newson wrote:
> > "cursor" has an established meaning in other databases and ours would not
> > be very close to them. I don’t think it’s a good idea.
> >
> > B.
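To make the PoC behaviour described at the top of this message a bit more concrete, here is a rough sketch in Python (purely illustrative; the real code would live in CouchDB's Erlang codebase, and base64-encoded JSON is just one possible encoding for the opaque bookmark):

```
import base64
import json

def decode_bookmark(bookmark):
    # The bookmark is treated as an opaque value by clients; base64(JSON)
    # here is only an illustration (protobuf was also mentioned).
    return json.loads(base64.urlsafe_b64decode(bookmark.encode()))

def encode_bookmark(args):
    return base64.urlsafe_b64encode(json.dumps(args).encode()).decode()

def next_bookmark(bookmark, last_emitted_key):
    # Only start_key advances between pages; limit, end_key and direction
    # are carried along unchanged, exactly as listed above.
    args = decode_bookmark(bookmark)
    args["start_key"] = last_emitted_key
    return encode_bookmark(args)
```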
> >
> >> On 23 Apr 2020, at 11:50, Ilya Khlopotov wrote:
> >>
> >>> The best I could come up with is replacing page with
> >>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
> >>
> >> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
> >>
> >>> On 2020/04/23 08:54:36, Garren Smith wrote:
> >>> I agree with Bob that page doesn't make sense as an endpoint. I'm also
> >>> rubbish with naming. The best I could come up with is replacing page with
> >>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs.
> >>> All the fields in the bookmark make sense except timestamp. Why would it
> >>> matter if the timestamp is old? What happens if a node's time is an hour
> >>> behind another node?
> >>>
> >>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov wrote:
> >>>>
> >>>> - page is to provide some notion of progress for the user
> >>>> - timestamp - I was thinking that we should drop requests if a user
> >>>> tries to pass a bookmark created an hour ago.
> >>>>
> >>>> On 2020/04/22 21:58:40, Robert Samuel Newson wrote:
> >>>>> "page" and "page number" are odd to me as these don't exist as concepts,
> >>>>> I'd rather not invent them. I note there's no mention of page size, which
> >>>>> makes "page number" very vague.
> >>>>>
> >>>>> What is "timestamp" in the bookmark and what effect does it have when
> >>>>> the bookmark is passed back in?
> >>>>>
> >>>>> I guess, why does the bookmark include so much extraneous data? Items
> >>>>> that are not needed to find the fdb key to begin the next response from.
> >>>>>
> >>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov wrote:
> >>>>>>
> >>>>>> Hello everyone,
> >>>>>>
> >>>>>> Based on the discussions on the thread I would like to propose a
> >>>>>> number of first steps:
> >>>>>> 1) introduce new endpoints
> >>>>>>    - {db}/_all_docs/page
> >>>>>>    - {db}/_all_docs/queries/page
> >>>>>>    - _all_dbs/page
> >>>>>>    - _dbs_info/page
> >>>>>>    - {db}/_design/{ddoc}/_view/{view}/page
> >>>>>>    - {db}/_design/{ddoc}/_view/{view}/queries/page
> >>>>>>    - {db}/_find/page
> >>>>>>
> >>>>>>    These new endpoints would act as follows:
> >>>>>>    - don't use delayed responses
> >>>>>>    - return an object with the following structure
> >>>>>>      ```
> >>>>>>      {
> >>>>>>          "total": Total,
> >>>>>>          "bookmark": base64 encoded opaque value,
> >>>>>>          "completed": true | false,
> >>>>>>          "update_seq": when available,
> >>>>>>          "page": current page number,
> >>>>>>          "items": [
> >>>>>>          ]
> >>>>>>      }
> >>>>>>      ```
> >>>>>>    - the bookmark would include the following data (base64 or protobuf???):
> >>>>>>      - direction
> >>>>>>      - page
> >>>>>>      - descending
> >>>>>>      - endkey
> >>>>>>      - endkey_docid
> >>>>>>      - inclusive_end
> >>>>>>      - startkey
> >>>>>>      - startkey_docid
> >>>>>>      - last_key
> >>>>>>      - update_seq
> >>>>>>      - timestamp
> >>>>>>
> >>>>>> 2) Implement per-endpoint configurable max limits
> >>>>>>    ```
> >>>>>>    _all_docs = 5000
> >>>>>>    _all_docs/queries = 5000
> >>>>>>    _all_dbs = 5000
> >>>>>>    _dbs_info = 5000
> >>>>>>    _view = 2500
> >>>>>>    _view/queries = 2500
> >>>>>>    _find = 2500
> >>>>>>    ```
> >>>>>>
> >>>>>> Later (after a few years) CouchDB would deprecate and remove the old
> >>>>>> endpoints.
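To show how a consumer might use the proposed endpoints, here is a hypothetical client loop against {db}/_all_docs/page. The path and the "items"/"completed"/"bookmark" fields are taken from the proposal above and may well change; this is only a sketch of the intended interaction, not an existing API:

```
import requests

def fetch_all(base_url, db):
    # Follow bookmarks until the server reports that the result set is complete.
    url = f"{base_url}/{db}/_all_docs/page"
    params = {}
    while True:
        body = requests.get(url, params=params).json()
        for item in body["items"]:
            yield item
        if body.get("completed"):
            break
        # The bookmark is opaque; the client simply passes it back unchanged.
        params = {"bookmark": body["bookmark"]}
```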
> >>>>>>
> >>>>>> Best regards,
> >>>>>> iilyak
> >>>>>>
> >>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc wrote:
> >>>>>>> Hello everyone,
> >>>>>>>
> >>>>>>> I'd like to discuss the shape and behavior of streaming APIs for
> >>>>>>> CouchDB 4.x.
> >>>>>>>
> >>>>>>> By "streaming APIs" I mean APIs which stream data row by row as it
> >>>>>>> gets read from the database. These are the endpoints I was thinking of:
> >>>>>>>
> >>>>>>> _all_docs, _all_dbs, _dbs_info and query results
> >>>>>>>
> >>>>>>> I want to focus on what happens when FoundationDB transactions
> >>>>>>> time out after 5 seconds. Currently, all those APIs except _changes[1]
> >>>>>>> feeds will crash or freeze. The reason is that the
> >>>>>>> transaction_too_old error at the end of 5 seconds is retryable by
> >>>>>>> default, so the request handlers run again and end up shoving the
> >>>>>>> whole request down the socket again, headers and all, which is
> >>>>>>> obviously broken and not what we want.
> >>>>>>>
> >>>>>>> There are a few alternatives discussed in the couchdb-dev channel. I'll
> >>>>>>> present some behaviors but feel free to add more. Some ideas might
> >>>>>>> have been discounted in the IRC discussion already but I'll present
> >>>>>>> them anyway in case it sparks further conversation:
> >>>>>>>
> >>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and continue
> >>>>>>> streaming the data from the next key after the last one emitted in the
> >>>>>>> previous transaction. Document the API behavior change: the view of the
> >>>>>>> data presented this way is no longer a point-in-time[4] snapshot of the DB.
> >>>>>>>
> >>>>>>> - Keeps the API shape the same as CouchDB <4.0. Client libraries
> >>>>>>> don't have to change to continue using these CouchDB 4.0 endpoints.
> >>>>>>> - This is the easiest to implement since it would re-use the
> >>>>>>> implementation for the _changes feed (an extra option passed to the
> >>>>>>> fold function).
> >>>>>>> - Breaks API behavior if users relied on having a point-in-time[4]
> >>>>>>> snapshot view of the data.
> >>>>>>>
> >>>>>>> B) Simply end the stream. Let the users pass a `?transaction=true`
> >>>>>>> param which indicates they are aware the stream may end early and so
> >>>>>>> they would have to paginate from the last emitted key with a skip=1. This
> >>>>>>> will keep the request bodies the same as current CouchDB. However, if
> >>>>>>> the users got all the data in one request, they will end up wasting
> >>>>>>> another request to see if there is more data available. If they didn't
> >>>>>>> get any data they might have too large a skip value (see [2]) and so
> >>>>>>> would have to guess different values for start/end keys. Or we impose a
> >>>>>>> max limit for the `skip` parameter.
> >>>>>>>
> >>>>>>> C) End the stream and add a final metadata row like "transaction":
> >>>>>>> "timeout" at the end. That will let the user know to keep paginating
> >>>>>>> from the last key onward. This won't work for `_all_dbs` and
> >>>>>>> `_dbs_info`[3]. Maybe let those two endpoints behave like _changes
> >>>>>>> feeds and only use this for views and _all_docs? If we like this
> >>>>>>> choice, let's think about what happens for those, as I couldn't come
> >>>>>>> up with anything decent there.
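As a rough illustration of what option C could look like from the client's side (the trailing metadata row, its "transaction": "timeout" shape and the resume-with-skip=1 behaviour are all hypothetical sketches of the idea above, not an existing API):

```
import json
import requests

def read_all_docs(base_url, db):
    params = {}
    while True:
        rows = requests.get(f"{base_url}/{db}/_all_docs", params=params).json()["rows"]
        timed_out = bool(rows) and rows[-1].get("transaction") == "timeout"
        if timed_out:
            rows = rows[:-1]  # drop the trailing metadata row
        for row in rows:
            yield row
        if not timed_out or not rows:
            break
        # Resume after the last emitted key; skip=1 steps over that key itself.
        params = {"start_key": json.dumps(rows[-1]["key"]), "skip": 1}
```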
> >>>>>>>
> >>>>>>> D) Same as C, but to solve the issue with skips[2], emit a bookmark
> >>>>>>> "key" of where the iteration stopped and the current "skip" and
> >>>>>>> "limit" params, which would keep decreasing. Then the user would pass
> >>>>>>> those in "start_key=..." in the next request along with the limit and
> >>>>>>> skip params. So something like "continuation":{"skip":599, "limit":5,
> >>>>>>> "key":"..."}. This has the same issue with array results for
> >>>>>>> `_all_dbs` and `_dbs_info`[3].
> >>>>>>>
> >>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum values
> >>>>>>> there such that the response time is likely to fit in one transaction.
> >>>>>>> This could be tricky as different runtime environments will have
> >>>>>>> different characteristics. Also, if the timeout happens there isn't
> >>>>>>> a nice way to send an HTTP error since we already sent the 200
> >>>>>>> response. The downside is that this might break how some users use the
> >>>>>>> API, if, say, they are using large skips and limits already. Perhaps here
> >>>>>>> we do both B and D, such that if users want transactional behavior,
> >>>>>>> they specify the `transaction=true` param and only then do we enforce
> >>>>>>> low limit and skip maximums.
> >>>>>>>
> >>>>>>> F) At least for `_all_docs` it seems providing a point-in-time
> >>>>>>> snapshot view doesn't necessarily need to be tied to transaction
> >>>>>>> boundaries. We could check the update sequence of the database at the
> >>>>>>> start of the next transaction and if it hasn't changed we can continue
> >>>>>>> emitting a consistent view. This can apply to C and D and would just
> >>>>>>> determine when the stream ends. If there are no writes happening to
> >>>>>>> the db, this could potentially stream all the data just like option A
> >>>>>>> would. Not entirely sure if this would work for views.
> >>>>>>>
> >>>>>>> So what do we think? I can see different combinations of options here,
> >>>>>>> maybe even different ones for each API endpoint. For example, `_all_dbs`
> >>>>>>> and `_dbs_info` are always A, and `_all_docs` and views default to A but
> >>>>>>> have parameters to do F, etc.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> -Nick
> >>>>>>>
> >>>>>>> Some footnotes:
> >>>>>>>
> >>>>>>> [1] The _changes feed is the only one that works currently. It behaves
> >>>>>>> as per the RFC:
> >>>>>>> https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns
> >>>>>>> That is, we continue streaming the data by resetting the transaction
> >>>>>>> object and restarting from the last emitted key (db sequence in this
> >>>>>>> case). However, because the transaction restarts, if a document is
> >>>>>>> updated while the streaming takes place it may appear in the _changes
> >>>>>>> feed twice. That's a behavior difference from CouchDB < 4.0 and we'd
> >>>>>>> have to document it, since previously we presented a point-in-time
> >>>>>>> snapshot of the database from when we started streaming.
> >>>>>>>
> >>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB doesn't
> >>>>>>> currently support efficient offsets for key selectors
> >>>>>>> (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging)
> >>>>>>> we implemented skip by iterating over the data. This means that a skip
> >>>>>>> of, say, 100000 could keep timing out the transaction without yielding
> >>>>>>> any data.
> >>>>>>>
> >>>>>>> [3] _all_dbs and _dbs_info return a JSON array so they don't have an
> >>>>>>> obvious place to insert a last metadata row.
> >>>>>>>
> >>>>>>> [4] For example, they have a constraint that documents "a" and "z"
> >>>>>>> cannot both be in the database at the same time.
> >>>>>>> But when iterating
> >>>>>>> it's possible that "a" was there at the start. Then by the end, "a"
> >>>>>>> was removed and "z" added, so both "a" and "z" would appear in the
> >>>>>>> emitted stream. Note that FoundationDB has APIs which exhibit the same
> >>>>>>> "relaxed" constraints:
> >>>>>>> https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
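Finally, a very rough sketch of the idea in option F above, using the FoundationDB Python binding only for illustration: keep restarting read transactions, but keep streaming only while the database's update sequence has not moved, so the response as a whole still reflects a single consistent snapshot. get_update_seq and emit_rows_from are hypothetical helpers, not real CouchDB functions (the actual implementation would be Erlang):

```
import fdb

fdb.api_version(620)

def stream_snapshot(db, start_key, get_update_seq, emit_rows_from):
    last_key = start_key
    snapshot_seq = None
    while True:
        tr = db.create_transaction()
        seq = get_update_seq(tr)  # hypothetical helper
        if snapshot_seq is None:
            snapshot_seq = seq
        elif seq != snapshot_seq:
            return False  # data changed between transactions; the snapshot is gone
        try:
            # Hypothetical helper: emits rows starting after last_key and
            # returns (new_last_key, done).
            last_key, done = emit_rows_from(tr, last_key)
            if done:
                return True
        except fdb.FDBError as err:
            if err.code != 1007:  # 1007 = transaction_too_old
                raise
            # The transaction timed out: loop around, re-check the update
            # sequence and continue from the last emitted key.
```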