From: Robert Samuel Newson
Subject: Re: [DISCUSS] Streaming API in CouchDB 4.0
Date: Thu, 23 Apr 2020 19:43:50 +0100
To: CouchDB Developers

I think a key difference from "cursors" as I've seen them elsewhere is that ours would point at an ever-changing database; you couldn't seamlessly cursor through a large data set one "page" at a time.

Bookmarks began in search (raises guilty hand) in order to address a Lucene-specific issue (high values of "skip" are incredibly inefficient, using lots of RAM). That is not true for CouchDB's own indexes, which can be navigated perfectly with startkey/endkey/startkey_docid/endkey_docid, etc.

I guess I'm not helping much with these observations, but I wouldn't like to see CouchDB gain an additional and ugly method of doing something that is already possible.

B.
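As an illustration of the startkey/endkey navigation Bob describes: this is ordinary keyset pagination, sketched below in Python. The `limit`, `startkey`, and `startkey_docid` parameters are existing _all_docs query parameters; the local URL, database name, and fetch-one-extra-row strategy are assumptions for the example, not anything proposed in this thread.

```
# Keyset pagination over _all_docs: fetch limit+1 rows, then use the
# extra row's key/id as the start of the next page. Assumes a local
# CouchDB at http://localhost:5984 and a database "mydb" (hypothetical).
import json
import requests

def iterate_all_docs(base="http://localhost:5984", db="mydb", page_size=100):
    params = {"limit": page_size + 1}
    while True:
        resp = requests.get(f"{base}/{db}/_all_docs", params=params)
        resp.raise_for_status()
        rows = resp.json()["rows"]
        for row in rows[:page_size]:
            yield row
        if len(rows) <= page_size:
            return  # no extra row, so this was the last page
        nxt = rows[page_size]
        # startkey/startkey_docid restart the scan exactly at the next row
        params = {
            "limit": page_size + 1,
            "startkey": json.dumps(nxt["key"]),
            "startkey_docid": nxt["id"],
        }
```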
> On 23 Apr 2020, at 19:02, Joan Touzet wrote:
> 
> I realise this is bikeshedding, but I guess that's kind of the point... Everything below is my opinion, not "fact."
> 
> It's unfortunate that we need a new endpoint for all of this. In a vacuum I might have just suggested we use the semantics we already have, perhaps with ?from= instead of ?since= .
> 
> "page" only works if the size of a page is well known, either by server preference or directly in the URL. If I ask for:
> 
> GET /{db}/_all_docs?limit=20&page=3
> 
> I know that I'm always going to get documents 41 through 60 in the default collation order.
> 
> There's a *fantastic* summary of examples from popular REST APIs here:
> 
> https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c
> 
> We are *pretty close* to what a cursor means in those other examples, except for the fact that our cursor can go stale/invalid after a short time.
> 
> Bob, could you be a bit more detailed in your explanation of how our definition isn't close to these? Or did you mean a SQL CURSOR (which is something entirely different)? If so, I'm fine with this being a REST API cursor - something clearly distinct.
> 
> I come back to wanting to preserve the existing endpoint syntax and naming, without new endpoints, but specifying this new FDB token via ?cursor= and making this the trigger for the new behaviour. At some point, we simply stop accepting ?since= tokens. This seems in line with other popular REST APIs.
> 
> -Joan "still sick and not sleeping right" Touzet
> 
> 
> On 2020-04-23 12:30, Robert Newson wrote:
>> "cursor" has an established meaning in other databases, and ours would not be very close to it. I don't think it's a good idea.
>> B.
>>> On 23 Apr 2020, at 11:50, Ilya Khlopotov wrote:
>>> 
>>>> The best I could come up with is replacing page with
>>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
>>> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
>>> 
>>>> On 2020/04/23 08:54:36, Garren Smith wrote:
>>>> I agree with Bob that page doesn't make sense as an endpoint. I'm also rubbish with naming. The best I could come up with is replacing page with cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs.
>>>> All the fields in the bookmark make sense except timestamp. Why would it matter if the timestamp is old? What happens if a node's time is an hour behind another node?
>>>> 
>>>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov wrote:
>>>>> 
>>>>> - page is to provide some notion of progress for the user
>>>>> - timestamp - I was thinking that we should drop requests if the user tries to pass a bookmark created an hour ago.
>>>>> 
>>>>> On 2020/04/22 21:58:40, Robert Samuel Newson wrote:
>>>>>> "page" and "page number" are odd to me as these don't exist as concepts; I'd rather not invent them. I note there's no mention of page size, which makes "page number" very vague.
>>>>>> 
>>>>>> What is "timestamp" in the bookmark, and what effect does it have when the bookmark is passed back in?
>>>>>> 
>>>>>> I guess, why does the bookmark include so much extraneous data? Items that are not needed to find the fdb key to begin the next response from.
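To make Joan's page example above concrete: page-number pagination is just an offset calculation. A minimal sketch; the translation of `page` into a skip count is an assumption about how such a parameter might behave, not something specified anywhere in this thread.

```
# With limit=20&page=3, rows 41..60 come back: the server would skip
# the first (page - 1) * limit rows in the default collation order.
def page_window(limit, page):
    skip = (page - 1) * limit    # rows passed over; 40 for page 3
    first = skip + 1             # 1-based index of the first row returned
    last = skip + limit          # 1-based index of the last row returned
    return first, last

assert page_window(20, 3) == (41, 60)
```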
>>>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov wrote:
>>>>>>> 
>>>>>>> Hello everyone,
>>>>>>> 
>>>>>>> Based on the discussions on this thread I would like to propose a number of first steps:
>>>>>>> 1) introduce new endpoints
>>>>>>>    - {db}/_all_docs/page
>>>>>>>    - {db}/_all_docs/queries/page
>>>>>>>    - _all_dbs/page
>>>>>>>    - _dbs_info/page
>>>>>>>    - {db}/_design/{ddoc}/_view/{view}/page
>>>>>>>    - {db}/_design/{ddoc}/_view/{view}/queries/page
>>>>>>>    - {db}/_find/page
>>>>>>> 
>>>>>>> These new endpoints would act as follows:
>>>>>>> - don't use delayed responses
>>>>>>> - return an object with the following structure:
>>>>>>> ```
>>>>>>> {
>>>>>>>     "total": Total,
>>>>>>>     "bookmark": base64-encoded opaque value,
>>>>>>>     "completed": true | false,
>>>>>>>     "update_seq": when available,
>>>>>>>     "page": current page number,
>>>>>>>     "items": [
>>>>>>>     ]
>>>>>>> }
>>>>>>> ```
>>>>>>> - the bookmark would include the following data (base64 or protobuf???):
>>>>>>>    - direction
>>>>>>>    - page
>>>>>>>    - descending
>>>>>>>    - endkey
>>>>>>>    - endkey_docid
>>>>>>>    - inclusive_end
>>>>>>>    - startkey
>>>>>>>    - startkey_docid
>>>>>>>    - last_key
>>>>>>>    - update_seq
>>>>>>>    - timestamp
>>>>>>> 
>>>>>>> 2) Implement per-endpoint configurable max limits
>>>>>>> ```
>>>>>>> _all_docs = 5000
>>>>>>> _all_docs/queries = 5000
>>>>>>> _all_dbs = 5000
>>>>>>> _dbs_info = 5000
>>>>>>> _view = 2500
>>>>>>> _view/queries = 2500
>>>>>>> _find = 2500
>>>>>>> ```
>>>>>>> 
>>>>>>> Later (after a few years) CouchDB would deprecate and remove the old endpoints.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> iilyak
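As an illustration of Ilya's opaque bookmark, one possible realization is base64-encoded JSON. This is a minimal sketch only: the field names come from the list above, but the encoding choice and the helper names are assumptions, since the proposal explicitly leaves "base64 or protobuf" open, and the one-hour expiry is Ilya's suggestion from earlier in the thread.

```
# Hypothetical bookmark codec: the token is opaque to clients, while the
# server can round-trip the pagination state it needs to resume a scan.
import base64
import json
import time

def encode_bookmark(state):
    state = dict(state, timestamp=int(time.time()))
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_bookmark(token, max_age_secs=3600):
    state = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    # Ilya's suggestion: drop requests whose bookmark is over an hour old.
    if time.time() - state["timestamp"] > max_age_secs:
        raise ValueError("bookmark expired")
    return state

token = encode_bookmark({"direction": "fwd", "page": 3, "last_key": "doc-0041"})
```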
>>>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc wrote:
>>>>>>>> Hello everyone,
>>>>>>>> 
>>>>>>>> I'd like to discuss the shape and behavior of streaming APIs for CouchDB 4.x.
>>>>>>>> 
>>>>>>>> By "streaming APIs" I mean APIs which stream data in rows as they get read from the database. These are the endpoints I was thinking of:
>>>>>>>> 
>>>>>>>> _all_docs, _all_dbs, _dbs_info and query results
>>>>>>>> 
>>>>>>>> I want to focus on what happens when FoundationDB transactions time out after 5 seconds. Currently, all those APIs except _changes[1] feeds will crash or freeze. The reason is that the transaction_too_old error at the end of 5 seconds is retryable by default, so the request handlers run again and end up shoving the whole request down the socket again, headers and all, which is obviously broken and not what we want.
>>>>>>>> 
>>>>>>>> There are a few alternatives discussed in the couchdb-dev channel. I'll present some behaviors, but feel free to add more. Some ideas might have been discounted in the IRC discussion already, but I'll present them anyway in case it sparks further conversation:
>>>>>>>> 
>>>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and continue streaming the data from the next key after the last one emitted in the previous transaction. Document the API behavior change: the view of the data it presents may never be a point-in-time[4] snapshot of the DB.
>>>>>>>> 
>>>>>>>> - Keeps the API shape the same as CouchDB <4.0. Client libraries don't have to change to continue using these CouchDB 4.0 endpoints.
>>>>>>>> - This is the easiest to implement since it would re-use the implementation for the _changes feed (an extra option passed to the fold function).
>>>>>>>> - Breaks API behavior if users relied on having a point-in-time[4] snapshot view of the data.
>>>>>>>> 
>>>>>>>> B) Simply end the stream. Let users pass a `?transaction=true` param which indicates they are aware the stream may end early, and so they would have to paginate from the last emitted key with a skip=1. This will keep the request bodies the same as in current CouchDB. However, if users got all the data in one request, they will end up wasting another request to see if there is more data available. If they didn't get any data, they might have too large a skip value (see [2]) and so would have to guess different values for start/end keys. Or we impose a max limit on the `skip` parameter.
>>>>>>>> 
>>>>>>>> C) End the stream and add a final metadata row like "transaction": "timeout" at the end. That will let the user know to keep paginating from the last key onward. This won't work for `_all_dbs` and `_dbs_info`[3]. Maybe let those two endpoints behave like _changes feeds and only use this for views and _all_docs? If we like this choice, let's think about what happens for those, as I couldn't come up with anything decent there.
>>>>>>>> 
>>>>>>>> D) Same as C, but to solve the issue with skips[2], emit a bookmark "key" of where the iteration stopped along with the current "skip" and "limit" params, which would keep decreasing. The user would then pass those in "start_key=..." in the next request along with the limit and skip params. So something like "continuation":{"skip":599, "limit":5, "key":"..."}. This has the same issue with array results for `_all_dbs` and `_dbs_info`[3].
>>>>>>>> 
>>>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum values there such that the response time is likely to fit in one transaction. This could be tricky, as different runtime environments will have different characteristics. Also, if the timeout happens there isn't a nice way to send an HTTP error since we already sent the 200 response. The downside is that this might break how some users use the API, if, say, they are using large skips and limits already. Perhaps here we do both B and D, such that if users want transactional behavior, they specify the `transaction=true` param and only then do we enforce low limit and skip maximums.
>>>>>>>> 
>>>>>>>> F) At least for `_all_docs`, it seems providing a point-in-time snapshot view doesn't necessarily need to be tied to transaction boundaries. We could check the update sequence of the database at the start of the next transaction, and if it hasn't changed we can continue emitting a consistent view. This can apply to C and D and would just determine when the stream ends. If there are no writes happening to the db, this could potentially stream all the data just like option A would. Not entirely sure if this would work for views.
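To picture options C and D from the client's side, here is a minimal sketch. Everything in it is hypothetical: the final metadata row `{"transaction": "timeout", ...}`, the `continuation` object and its fields, and the `start_key`/`skip`/`limit` resume parameters are Nick's proposals as written, not shipped behavior, and the local URL and database name are stand-ins.

```
# Hypothetical client loop for options C/D: consume rows until the
# server signals it hit the 5-second FDB transaction limit, then resume
# from the continuation it handed back.
import json
import requests

def fetch_all(base="http://localhost:5984", db="mydb"):
    params = {}
    while True:
        body = requests.get(f"{base}/{db}/_all_docs", params=params).json()
        cont = None
        for row in body["rows"]:
            if row.get("transaction") == "timeout":  # option C's metadata row
                cont = row["continuation"]           # option D's resume state
            else:
                yield row
        if cont is None:
            return  # stream completed within a single transaction
        # resume exactly where the previous transaction stopped
        params = {
            "start_key": json.dumps(cont["key"]),
            "skip": cont["skip"],
            "limit": cont["limit"],
        }
```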
>>>>>>>> So what do we think? I can see different combinations of options here, maybe even a different one for each API endpoint. For example, `_all_dbs` and `_dbs_info` are always A, and `_all_docs` and views default to A but have parameters to do F, etc.
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> -Nick
>>>>>>>> 
>>>>>>>> Some footnotes:
>>>>>>>> 
>>>>>>>> [1] The _changes feed is the only one that currently works. It behaves as per the RFC https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns . That is, we continue streaming the data by resetting the transaction object and restarting from the last emitted key (the db sequence in this case). However, because of the transaction restarts, if a document is updated while the streaming takes place, it may appear in the _changes feed twice. That's a behavior difference from CouchDB < 4.0 and we'd have to document it, since previously we presented a point-in-time snapshot of the database from when we started streaming.
>>>>>>>> 
>>>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB doesn't currently support efficient offsets for key selectors (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging) we implemented skip by iterating over the data. This means that a skip of, say, 100000 could keep timing out the transaction without yielding any data.
>>>>>>>> 
>>>>>>>> [3] _all_dbs and _dbs_info return a JSON array, so they don't have an obvious place to insert a last metadata row.
>>>>>>>> 
>>>>>>>> [4] For example, they have a constraint that documents "a" and "z" cannot both be in the database at the same time. But when iterating, it's possible that "a" was there at the start. Then by the end, "a" was removed and "z" added, so both "a" and "z" would appear in the emitted stream. Note that FoundationDB has APIs which exhibit the same "relaxed" constraints:
>>>>>>>> https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
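Footnote [2] is the crux of the whole thread, so a minimal sketch of skip-by-iteration using the public FoundationDB Python bindings may help. The key range and skip value are made up; the point is that skipping means reading and discarding rows inside one transaction, and the `@fdb.transactional` decorator retries transaction_too_old by default, which mirrors the request-handler retry loop Nick describes.

```
# Why a large ?skip= may never finish: without efficient key-selector
# offsets, skipping 100000 rows means iterating over them, which may not
# fit in the 5-second transaction window; the transactional decorator
# then restarts the whole scan from scratch rather than failing cleanly.
import fdb

fdb.api_version(630)
db = fdb.open()  # assumes a default cluster file (hypothetical setup)

@fdb.transactional
def first_key_after_skip(tr, begin, end, skip):
    seen = 0
    for kv in tr.get_range(begin, end):  # reads and discards, row by row
        if seen == skip:
            return kv.key  # the row a skipping reader actually wants
        seen += 1
    return None

# e.g. first_key_after_skip(db, b'docs/', b'docs0', 100000)
```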