Subject: Re: [DISCUSS] Streaming API in CouchDB 4.0
From: Joan Touzet
To: dev@couchdb.apache.org
Date: Thu, 23 Apr 2020 14:02:34 -0400
Message-ID: <30f3e543-4cb8-d20d-21d6-74761b3c156f@apache.org>

I realise this is bikeshedding, but I guess that's kind of the point...
Everything below is my opinion, not "fact."
It's unfortunate we need a new endpoint for all of this. In a vacuum I
might have just suggested we use the semantics we already have, perhaps
with ?from= instead of ?since=.

"page" only works if the size of a page is well known, either by server
preference or directly in the URL. If I ask for:

GET /{db}/_all_docs?limit=20&page=3

I know that I'm always going to get documents 41 through 60 in the
default collation order.

There's a *fantastic* summary of examples from popular REST APIs here:

https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c

We are *pretty close* to what a cursor means in those other examples,
except for the fact that our cursor can go stale/invalid after a short
time.

Bob, could you explain in a bit more detail how our definition isn't
close to these? Or did you mean a SQL CURSOR (which is something
entirely different)? If so, I'm fine with this being a REST API
cursor - something clearly distinct.

I come back to wanting to preserve the existing endpoint syntax and
naming, without new endpoints, but specifying this new FDB token via
?cursor= and this being the trigger for the new behaviour. At some
point, we simply stop accepting ?since= tokens. This seems in line with
other popular REST APIs.

-Joan "still sick and not sleeping right" Touzet

On 2020-04-23 12:30, Robert Newson wrote:
> cursor has an established meaning in other databases and ours would
> not be very close to them. I don't think it's a good idea.
>
> B.
>
>> On 23 Apr 2020, at 11:50, Ilya Khlopotov wrote:
>>
>>> The best I could come up with is replacing page with
>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
>>
>> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
>>
>>> On 2020/04/23 08:54:36, Garren Smith wrote:
>>> I agree with Bob that page doesn't make sense as an endpoint. I'm
>>> also rubbish with naming. The best I could come up with is replacing
>>> page with cursor - {db}/_all_docs/cursor or possibly
>>> {db}/_cursor/_all_docs
>>> All the fields in the bookmark make sense except timestamp. Why
>>> would it matter if the timestamp is old? What happens if a node's
>>> time is an hour behind another node?
>>>
>>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov wrote:
>>>>
>>>> - page is to provide some notion of progress for the user
>>>> - timestamp - I was thinking that we should drop requests if the
>>>> user tries to pass a bookmark created an hour ago.
>>>>
>>>> On 2020/04/22 21:58:40, Robert Samuel Newson wrote:
>>>>> "page" and "page number" are odd to me as these don't exist as
>>>>> concepts, I'd rather not invent them. I note there's no mention of
>>>>> page size, which makes "page number" very vague.
>>>>>
>>>>> What is "timestamp" in the bookmark and what effect does it have
>>>>> when the bookmark is passed back in?
>>>>>
>>>>> I guess, why does the bookmark include so much extraneous data?
>>>>> Items that are not needed to find the fdb key at which to begin
>>>>> the next response.
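To make Robert's point concrete: only the resume key is strictly needed
to construct the next request; everything else in the proposed bookmark
(quoted below) is paging or validity metadata. A minimal sketch,
assuming the bookmark is base64-encoded JSON, which is one of the two
encodings floated in the proposal below:

```
import base64
import json

def encode_bookmark(fields):
    # Opaque to the client. base64-over-JSON is an assumption here;
    # the proposal below leaves the choice (base64 vs protobuf) open.
    return base64.urlsafe_b64encode(json.dumps(fields).encode()).decode()

def decode_bookmark(bookmark):
    return json.loads(base64.urlsafe_b64decode(bookmark))

# Strictly sufficient to find the next fdb key:
minimal = encode_bookmark({"last_key": "doc-00042"})

# The proposed bookmark also carries paging state and a timestamp
# for expiry (field names taken from the proposal quoted below;
# the values here are made up):
proposed = encode_bookmark({
    "direction": "fwd",
    "page": 3,
    "descending": False,
    "last_key": "doc-00042",
    "update_seq": "42",
    "timestamp": 1587664954,
})
```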
>>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov wrote:
>>>>>>
>>>>>> Hello everyone,
>>>>>>
>>>>>> Based on the discussions on the thread I would like to propose a
>>>>>> number of first steps:
>>>>>> 1) introduce new endpoints
>>>>>>    - {db}/_all_docs/page
>>>>>>    - {db}/_all_docs/queries/page
>>>>>>    - _all_dbs/page
>>>>>>    - _dbs_info/page
>>>>>>    - {db}/_design/{ddoc}/_view/{view}/page
>>>>>>    - {db}/_design/{ddoc}/_view/{view}/queries/page
>>>>>>    - {db}/_find/page
>>>>>>
>>>>>> These new endpoints would act as follows:
>>>>>> - don't use delayed responses
>>>>>> - return an object with the following structure
>>>>>> ```
>>>>>> {
>>>>>>     "total": Total,
>>>>>>     "bookmark": base64 encoded opaque value,
>>>>>>     "completed": true | false,
>>>>>>     "update_seq": when available,
>>>>>>     "page": current page number,
>>>>>>     "items": [
>>>>>>     ]
>>>>>> }
>>>>>> ```
>>>>>> - the bookmark would include the following data (base64 or
>>>>>>   protobuf???):
>>>>>>   - direction
>>>>>>   - page
>>>>>>   - descending
>>>>>>   - endkey
>>>>>>   - endkey_docid
>>>>>>   - inclusive_end
>>>>>>   - startkey
>>>>>>   - startkey_docid
>>>>>>   - last_key
>>>>>>   - update_seq
>>>>>>   - timestamp
>>>>>>
>>>>>> 2) Implement per-endpoint configurable max limits
>>>>>> ```
>>>>>> _all_docs = 5000
>>>>>> _all_docs/queries = 5000
>>>>>> _all_dbs = 5000
>>>>>> _dbs_info = 5000
>>>>>> _view = 2500
>>>>>> _view/queries = 2500
>>>>>> _find = 2500
>>>>>> ```
>>>>>>
>>>>>> Later (after a few years) CouchDB would deprecate and remove the
>>>>>> old endpoints.
>>>>>>
>>>>>> Best regards,
>>>>>> iilyak
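As a concreteness check on the shape above, a rough sketch of the
client loop it implies, assuming the opaque value is echoed back via a
hypothetical `bookmark` query parameter (the proposal does not yet say
how it is passed on the next request):

```
import requests  # third-party HTTP client, used here for brevity

def iterate_pages(base_url, db):
    # Drain the proposed {db}/_all_docs/page endpoint page by page.
    params = {}
    while True:
        resp = requests.get(f"{base_url}/{db}/_all_docs/page",
                            params=params)
        resp.raise_for_status()
        body = resp.json()
        yield from body["items"]
        if body["completed"]:
            break
        # Echo the opaque bookmark back; the parameter name is assumed.
        params = {"bookmark": body["bookmark"]}
```

With the proposed per-endpoint max limits, a full scan of a large
database becomes a sequence of bounded requests rather than one long
delayed response.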
>>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc wrote:
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> I'd like to discuss the shape and behavior of streaming APIs for
>>>>>>> CouchDB 4.x
>>>>>>>
>>>>>>> By "streaming APIs" I mean APIs which stream data in rows as
>>>>>>> they get read from the database. These are the endpoints I was
>>>>>>> thinking of:
>>>>>>>
>>>>>>> _all_docs, _all_dbs, _dbs_info and query results
>>>>>>>
>>>>>>> I want to focus on what happens when FoundationDB transactions
>>>>>>> time out after 5 seconds. Currently, all those APIs except
>>>>>>> _changes[1] feeds will crash or freeze. The reason is that the
>>>>>>> transaction_too_old error at the end of 5 seconds is retry-able
>>>>>>> by default, so the request handlers run again and end up shoving
>>>>>>> the whole request down the socket again, headers and all, which
>>>>>>> is obviously broken and not what we want.
>>>>>>>
>>>>>>> There are a few alternatives discussed in the couchdb-dev
>>>>>>> channel. I'll present some behaviors but feel free to add more.
>>>>>>> Some ideas might have been discounted in the IRC discussion
>>>>>>> already but I'll present them anyway in case it sparks further
>>>>>>> conversation:
>>>>>>>
>>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and
>>>>>>> continue streaming the data from the next key after the last one
>>>>>>> emitted in the previous transaction. Document the API behavior
>>>>>>> change: the result may present a view of the data that is never
>>>>>>> a point-in-time[4] snapshot of the DB.
>>>>>>>
>>>>>>> - Keeps the API shape the same as CouchDB <4.0. Client libraries
>>>>>>> don't have to change to continue using these CouchDB 4.0
>>>>>>> endpoints
>>>>>>> - This is the easiest to implement since it would re-use the
>>>>>>> implementation for the _changes feed (an extra option passed to
>>>>>>> the fold function).
>>>>>>> - Breaks API behavior if users relied on having a
>>>>>>> point-in-time[4] snapshot view of the data.
>>>>>>>
>>>>>>> B) Simply end the stream. Let the users pass a
>>>>>>> `?transaction=true` param which indicates they are aware the
>>>>>>> stream may end early and so would have to paginate from the last
>>>>>>> emitted key with a skip=1. This will keep the request bodies the
>>>>>>> same as in current CouchDB. However, if the users got all the
>>>>>>> data in one request, they will end up wasting another request to
>>>>>>> see if there is more data available. If they didn't get any data
>>>>>>> they might have too large a skip value (see [2]) and so would
>>>>>>> have to guess different values for start/end keys. Or we impose
>>>>>>> a max limit for the `skip` parameter.
>>>>>>>
>>>>>>> C) End the stream and add a final metadata row like
>>>>>>> "transaction": "timeout" at the end. That will let the user know
>>>>>>> to keep paginating from the last key onward. This won't work for
>>>>>>> `_all_dbs` and `_dbs_info`[3]. Maybe let those two endpoints
>>>>>>> behave like _changes feeds and only use this for views and
>>>>>>> _all_docs? If we like this choice, let's think about what
>>>>>>> happens for those, as I couldn't come up with anything decent
>>>>>>> there.
>>>>>>>
>>>>>>> D) Same as C but, to solve the issue with skips[2], emit a
>>>>>>> bookmark "key" of where the iteration stopped and the current
>>>>>>> "skip" and "limit" params, which would keep decreasing. Then the
>>>>>>> user would pass those in "start_key=..." in the next request
>>>>>>> along with the limit and skip params. So something like
>>>>>>> "continuation":{"skip":599, "limit":5, "key":"..."}. This has
>>>>>>> the same issue with array results for `_all_dbs` and
>>>>>>> `_dbs_info`[3]. (A client-side sketch of this option follows
>>>>>>> after option F.)
>>>>>>>
>>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum
>>>>>>> values there such that the response time is likely to fit in one
>>>>>>> transaction. This could be tricky as different runtime
>>>>>>> environments will have different characteristics. Also, if the
>>>>>>> timeout happens there isn't a nice way to send an HTTP error
>>>>>>> since we already sent the 200 response. The downside is that
>>>>>>> this might break how some users use the API, if, say, they are
>>>>>>> using large skips and limits already. Perhaps here we do both B
>>>>>>> and D, such that if users want transactional behavior, they
>>>>>>> specify the `transaction=true` param and only then do we enforce
>>>>>>> low limit and skip maximums.
>>>>>>>
>>>>>>> F) At least for `_all_docs` it seems providing a point-in-time
>>>>>>> snapshot view doesn't necessarily need to be tied to transaction
>>>>>>> boundaries. We could check the update sequence of the database
>>>>>>> at the start of the next transaction and, if it hasn't changed,
>>>>>>> continue emitting a consistent view. This can apply to C and D
>>>>>>> and would just determine when the stream ends. If there are no
>>>>>>> writes happening to the db, this could potentially stream all
>>>>>>> the data just like option A would. Not entirely sure if this
>>>>>>> would work for views.
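Reading option D above as a client contract, a rough sketch of the loop
it implies; the `continuation` field names come from Nick's example,
while the request shape otherwise matches today's _all_docs and
everything else is an assumption:

```
import requests

def read_all(base_url, db, skip=0, limit=5000):
    # Option D sketch: re-issue the request from the returned
    # continuation until the server stops sending one.
    params = {"skip": skip, "limit": limit}
    while True:
        body = requests.get(f"{base_url}/{db}/_all_docs",
                            params=params).json()
        yield from body["rows"]
        cont = body.get("continuation")
        if cont is None:
            return
        # Per option D, skip/limit keep decreasing across requests;
        # "key" is where the previous iteration stopped.
        params = {"start_key": cont["key"], "skip": cont["skip"],
                  "limit": cont["limit"]}
```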
>>>>>>>
>>>>>>> So what do we think? I can see different combinations of options
>>>>>>> here, maybe even different ones for each API endpoint. For
>>>>>>> example, `_all_dbs` and `_dbs_info` are always A, and
>>>>>>> `_all_docs` and views default to A but have parameters to do F,
>>>>>>> etc.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> -Nick
>>>>>>>
>>>>>>> Some footnotes:
>>>>>>>
>>>>>>> [1] _changes feeds are the only ones that currently work. They
>>>>>>> behave as per the RFC
>>>>>>> https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns
>>>>>>> That is, we continue streaming the data by resetting the
>>>>>>> transaction object and restarting from the last emitted key (the
>>>>>>> db sequence in this case). However, because the transaction
>>>>>>> restarts, a document that is updated while the streaming takes
>>>>>>> place may appear in the _changes feed twice. That's a behavior
>>>>>>> difference from CouchDB < 4.0 and we'd have to document it,
>>>>>>> since previously we presented a point-in-time snapshot of the
>>>>>>> database from when we started streaming.
>>>>>>>
>>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB
>>>>>>> doesn't currently support efficient offsets for key selectors
>>>>>>> (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging)
>>>>>>> we implemented skip by iterating over the data. This means that
>>>>>>> a skip of, say, 100000 could keep timing out the transaction
>>>>>>> without yielding any data. (A sketch of this follows the
>>>>>>> footnotes.)
>>>>>>>
>>>>>>> [3] _all_dbs and _dbs_info return a JSON array so they don't
>>>>>>> have an obvious place to insert a last metadata row.
>>>>>>>
>>>>>>> [4] For example, they have a constraint that documents "a" and
>>>>>>> "z" cannot both be in the database at the same time. But when
>>>>>>> iterating it's possible that "a" was there at the start; then by
>>>>>>> the end, "a" was removed and "z" added, so both "a" and "z"
>>>>>>> would appear in the emitted stream. Note that FoundationDB has
>>>>>>> APIs which exhibit the same "relaxed" constraints:
>>>>>>> https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
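On footnote [2]: a minimal sketch of skip-by-iteration using the public
FoundationDB Python bindings (everything apart from the fdb calls
themselves is made up), showing why a large skip can exhaust the
5-second transaction budget before a single row is returned:

```
import fdb

fdb.api_version(620)
db = fdb.open()

@fdb.transactional
def range_with_skip(tr, begin, end, skip, limit):
    # No efficient key-selector offsets in FDB, so "skip" means
    # reading and discarding rows one by one. A skip of 100000 can
    # burn the whole 5-second budget before the first kept row,
    # failing with transaction_too_old and yielding nothing.
    out = []
    for i, kv in enumerate(tr.get_range(begin, end)):
        if i < skip:
            continue
        out.append(kv)
        if len(out) == limit:
            break
    return out

# Usage sketch: rows = range_with_skip(db, b'a', b'z', 100000, 5)
```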