From: Robert Samuel Newson
Subject: Re: [DISCUSS] Streaming API in CouchDB 4.0
Date: Thu, 23 Apr 2020 19:43:50 +0100
To: CouchDB Developers

I think a key difference from "cursors" as I've seen them elsewhere is that ours would point at an ever-changing database; you couldn't seamlessly cursor through a large data set one "page" at a time.

Bookmarks began in search (raises guilty hand) in order to address a Lucene-specific issue (high values of "skip" are incredibly inefficient, using lots of RAM). That is not true for CouchDB's own indexes, which can be navigated perfectly with startkey/endkey/startkey_docid/endkey_docid, etc.

I guess I'm not helping much with these observations, but I wouldn't like to see CouchDB gain an additional and ugly method of doing something that is already possible.

B.
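As an illustration of the startkey/endkey navigation Bob describes: this is ordinary keyset pagination, sketched below in Python. The `limit`, `startkey`, and `startkey_docid` parameters are existing _all_docs query parameters; the local URL, database name, and fetch-one-extra-row strategy are assumptions for the example, not anything proposed in this thread.

```
# Keyset pagination over _all_docs: fetch limit+1 rows, then use the
# extra row's key/id as the start of the next page. Assumes a local
# CouchDB at http://localhost:5984 and a database "mydb" (hypothetical).
import json
import requests

def iterate_all_docs(base="http://localhost:5984", db="mydb", page_size=100):
    params = {"limit": page_size + 1}
    while True:
        resp = requests.get(f"{base}/{db}/_all_docs", params=params)
        resp.raise_for_status()
        rows = resp.json()["rows"]
        for row in rows[:page_size]:
            yield row
        if len(rows) <= page_size:
            return  # no extra row, so this was the last page
        nxt = rows[page_size]
        # startkey/startkey_docid restart the scan exactly at the next row
        params = {
            "limit": page_size + 1,
            "startkey": json.dumps(nxt["key"]),
            "startkey_docid": nxt["id"],
        }
```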
> On 23 Apr 2020, at 19:02, Joan Touzet wrote:
> 
> I realise this is bikeshedding, but I guess that's kind of the point... Everything below is my opinion, not "fact."
> 
> It's unfortunate that we need a new endpoint for all of this. In a vacuum I might have just suggested we use the semantics we already have, perhaps with ?from= instead of ?since= .
> 
> "page" only works if the size of a page is well known, either by server preference or directly in the URL. If I ask for:
> 
> GET /{db}/_all_docs?limit=20&page=3
> 
> I know that I'm always going to get documents 41 through 60 in the default collation order.
> 
> There's a *fantastic* summary of examples from popular REST APIs here:
> 
> https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c
> 
> We are *pretty close* to what a cursor means in those other examples, except for the fact that our cursor can go stale/invalid after a short time.
> 
> Bob, could you be a bit more detailed in your explanation of how our definition isn't close to these? Or did you mean a SQL CURSOR (which is something entirely different)? If so, I'm fine with this being a REST API cursor - something clearly distinct.
> 
> I come back to wanting to preserve the existing endpoint syntax and naming, without new endpoints, but specifying this new FDB token via ?cursor= and making this the trigger for the new behaviour. At some point, we simply stop accepting ?since= tokens. This seems in line with other popular REST APIs.
> 
> -Joan "still sick and not sleeping right" Touzet
> 
> 
> On 2020-04-23 12:30, Robert Newson wrote:
>> "cursor" has an established meaning in other databases, and ours would not be very close to it. I don't think it's a good idea.
>> B.
>>> On 23 Apr 2020, at 11:50, Ilya Khlopotov wrote:
>>> 
>>>> The best I could come up with is replacing page with
>>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
>>> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
>>> 
>>>> On 2020/04/23 08:54:36, Garren Smith wrote:
>>>> I agree with Bob that page doesn't make sense as an endpoint. I'm also rubbish with naming. The best I could come up with is replacing page with cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs.
>>>> All the fields in the bookmark make sense except timestamp. Why would it matter if the timestamp is old? What happens if a node's time is an hour behind another node?
>>>> 
>>>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov wrote:
>>>>> 
>>>>> - page is to provide some notion of progress for the user
>>>>> - timestamp - I was thinking that we should drop requests if the user tries to pass a bookmark created an hour ago.
>>>>> 
>>>>> On 2020/04/22 21:58:40, Robert Samuel Newson wrote:
>>>>>> "page" and "page number" are odd to me as these don't exist as concepts; I'd rather not invent them. I note there's no mention of page size, which makes "page number" very vague.
>>>>>> 
>>>>>> What is "timestamp" in the bookmark, and what effect does it have when the bookmark is passed back in?
>>>>>> 
>>>>>> I guess, why does the bookmark include so much extraneous data? Items that are not needed to find the fdb key to begin the next response from.
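To make Joan's page example above concrete: page-number pagination is just an offset calculation. A minimal sketch; the translation of `page` into a skip count is an assumption about how such a parameter might behave, not something specified anywhere in this thread.

```
# With limit=20&page=3, rows 41..60 come back: the server would skip
# the first (page - 1) * limit rows in the default collation order.
def page_window(limit, page):
    skip = (page - 1) * limit    # rows passed over; 40 for page 3
    first = skip + 1             # 1-based index of the first row returned
    last = skip + limit          # 1-based index of the last row returned
    return first, last

assert page_window(20, 3) == (41, 60)
```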
>>>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov wrote:
>>>>>>> 
>>>>>>> Hello everyone,
>>>>>>> 
>>>>>>> Based on the discussions on this thread I would like to propose a number of first steps:
>>>>>>> 1) introduce new endpoints
>>>>>>>    - {db}/_all_docs/page
>>>>>>>    - {db}/_all_docs/queries/page
>>>>>>>    - _all_dbs/page
>>>>>>>    - _dbs_info/page
>>>>>>>    - {db}/_design/{ddoc}/_view/{view}/page
>>>>>>>    - {db}/_design/{ddoc}/_view/{view}/queries/page
>>>>>>>    - {db}/_find/page
>>>>>>> 
>>>>>>> These new endpoints would act as follows:
>>>>>>> - don't use delayed responses
>>>>>>> - return an object with the following structure:
>>>>>>> ```
>>>>>>> {
>>>>>>>     "total": Total,
>>>>>>>     "bookmark": base64-encoded opaque value,
>>>>>>>     "completed": true | false,
>>>>>>>     "update_seq": when available,
>>>>>>>     "page": current page number,
>>>>>>>     "items": [
>>>>>>>     ]
>>>>>>> }
>>>>>>> ```
>>>>>>> - the bookmark would include the following data (base64 or protobuf???):
>>>>>>>    - direction
>>>>>>>    - page
>>>>>>>    - descending
>>>>>>>    - endkey
>>>>>>>    - endkey_docid
>>>>>>>    - inclusive_end
>>>>>>>    - startkey
>>>>>>>    - startkey_docid
>>>>>>>    - last_key
>>>>>>>    - update_seq
>>>>>>>    - timestamp
>>>>>>> 
>>>>>>> 2) Implement per-endpoint configurable max limits
>>>>>>> ```
>>>>>>> _all_docs = 5000
>>>>>>> _all_docs/queries = 5000
>>>>>>> _all_dbs = 5000
>>>>>>> _dbs_info = 5000
>>>>>>> _view = 2500
>>>>>>> _view/queries = 2500
>>>>>>> _find = 2500
>>>>>>> ```
>>>>>>> 
>>>>>>> Later (after a few years) CouchDB would deprecate and remove the old endpoints.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> iilyak
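As an illustration of Ilya's opaque bookmark, one possible realization is base64-encoded JSON. This is a minimal sketch only: the field names come from the list above, but the encoding choice and the helper names are assumptions, since the proposal explicitly leaves "base64 or protobuf" open, and the one-hour expiry is Ilya's suggestion from earlier in the thread.

```
# Hypothetical bookmark codec: the token is opaque to clients, while the
# server can round-trip the pagination state it needs to resume a scan.
import base64
import json
import time

def encode_bookmark(state):
    state = dict(state, timestamp=int(time.time()))
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_bookmark(token, max_age_secs=3600):
    state = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    # Ilya's suggestion: drop requests whose bookmark is over an hour old.
    if time.time() - state["timestamp"] > max_age_secs:
        raise ValueError("bookmark expired")
    return state

token = encode_bookmark({"direction": "fwd", "page": 3, "last_key": "doc-0041"})
```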
>>>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc wrote:
>>>>>>>> Hello everyone,
>>>>>>>> 
>>>>>>>> I'd like to discuss the shape and behavior of streaming APIs for CouchDB 4.x.
>>>>>>>> 
>>>>>>>> By "streaming APIs" I mean APIs which stream data in rows as they get read from the database. These are the endpoints I was thinking of:
>>>>>>>> 
>>>>>>>> _all_docs, _all_dbs, _dbs_info and query results
>>>>>>>> 
>>>>>>>> I want to focus on what happens when FoundationDB transactions time out after 5 seconds. Currently, all those APIs except _changes[1] feeds will crash or freeze. The reason is that the transaction_too_old error at the end of 5 seconds is retryable by default, so the request handlers run again and end up shoving the whole request down the socket again, headers and all, which is obviously broken and not what we want.
>>>>>>>> 
>>>>>>>> There are a few alternatives discussed in the couchdb-dev channel. I'll present some behaviors, but feel free to add more. Some ideas might have been discounted in the IRC discussion already, but I'll present them anyway in case it sparks further conversation:
>>>>>>>> 
>>>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and continue streaming the data from the next key after the last one emitted in the previous transaction. Document the API behavior change: the view of the data it presents may never be a point-in-time[4] snapshot of the DB.
>>>>>>>> 
>>>>>>>> - Keeps the API shape the same as CouchDB <4.0. Client libraries don't have to change to continue using these CouchDB 4.0 endpoints.
>>>>>>>> - This is the easiest to implement since it would re-use the implementation for the _changes feed (an extra option passed to the fold function).
>>>>>>>> - Breaks API behavior if users relied on having a point-in-time[4] snapshot view of the data.
>>>>>>>> 
>>>>>>>> B) Simply end the stream. Let users pass a `?transaction=true` param which indicates they are aware the stream may end early, and so they would have to paginate from the last emitted key with a skip=1. This will keep the request bodies the same as in current CouchDB. However, if users got all the data in one request, they will end up wasting another request to see if there is more data available. If they didn't get any data, they might have too large a skip value (see [2]) and so would have to guess different values for start/end keys. Or we impose a max limit on the `skip` parameter.
>>>>>>>> 
>>>>>>>> C) End the stream and add a final metadata row like "transaction": "timeout" at the end. That will let the user know to keep paginating from the last key onward. This won't work for `_all_dbs` and `_dbs_info`[3]. Maybe let those two endpoints behave like _changes feeds and only use this for views and _all_docs? If we like this choice, let's think about what happens for those, as I couldn't come up with anything decent there.
>>>>>>>> 
>>>>>>>> D) Same as C, but to solve the issue with skips[2], emit a bookmark "key" of where the iteration stopped along with the current "skip" and "limit" params, which would keep decreasing. The user would then pass those in "start_key=..." in the next request along with the limit and skip params. So something like "continuation":{"skip":599, "limit":5, "key":"..."}. This has the same issue with array results for `_all_dbs` and `_dbs_info`[3].
>>>>>>>> 
>>>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum values there such that the response time is likely to fit in one transaction. This could be tricky, as different runtime environments will have different characteristics. Also, if the timeout happens there isn't a nice way to send an HTTP error since we already sent the 200 response. The downside is that this might break how some users use the API, if, say, they are using large skips and limits already. Perhaps here we do both B and D, such that if users want transactional behavior, they specify the `transaction=true` param and only then do we enforce low limit and skip maximums.
>>>>>>>> 
>>>>>>>> F) At least for `_all_docs`, it seems providing a point-in-time snapshot view doesn't necessarily need to be tied to transaction boundaries. We could check the update sequence of the database at the start of the next transaction, and if it hasn't changed we can continue emitting a consistent view. This can apply to C and D and would just determine when the stream ends. If there are no writes happening to the db, this could potentially stream all the data just like option A would. Not entirely sure if this would work for views.
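To picture options C and D from the client's side, here is a minimal sketch. Everything in it is hypothetical: the final metadata row `{"transaction": "timeout", ...}`, the `continuation` object and its fields, and the `start_key`/`skip`/`limit` resume parameters are Nick's proposals as written, not shipped behavior, and the local URL and database name are stand-ins.

```
# Hypothetical client loop for options C/D: consume rows until the
# server signals it hit the 5-second FDB transaction limit, then resume
# from the continuation it handed back.
import json
import requests

def fetch_all(base="http://localhost:5984", db="mydb"):
    params = {}
    while True:
        body = requests.get(f"{base}/{db}/_all_docs", params=params).json()
        cont = None
        for row in body["rows"]:
            if row.get("transaction") == "timeout":  # option C's metadata row
                cont = row["continuation"]           # option D's resume state
            else:
                yield row
        if cont is None:
            return  # stream completed within a single transaction
        # resume exactly where the previous transaction stopped
        params = {
            "start_key": json.dumps(cont["key"]),
            "skip": cont["skip"],
            "limit": cont["limit"],
        }
```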
>>>>>>>> So what do we think? I can see different combinations of options here, maybe even a different one for each API endpoint. For example, `_all_dbs` and `_dbs_info` are always A, and `_all_docs` and views default to A but have parameters to do F, etc.
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> -Nick
>>>>>>>> 
>>>>>>>> Some footnotes:
>>>>>>>> 
>>>>>>>> [1] The _changes feed is the only one that currently works. It behaves as per the RFC https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns . That is, we continue streaming the data by resetting the transaction object and restarting from the last emitted key (the db sequence in this case). However, because of the transaction restarts, if a document is updated while the streaming takes place, it may appear in the _changes feed twice. That's a behavior difference from CouchDB < 4.0 and we'd have to document it, since previously we presented a point-in-time snapshot of the database from when we started streaming.
>>>>>>>> 
>>>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB doesn't currently support efficient offsets for key selectors (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging) we implemented skip by iterating over the data. This means that a skip of, say, 100000 could keep timing out the transaction without yielding any data.
>>>>>>>> 
>>>>>>>> [3] _all_dbs and _dbs_info return a JSON array, so they don't have an obvious place to insert a last metadata row.
>>>>>>>> 
>>>>>>>> [4] For example, they have a constraint that documents "a" and "z" cannot both be in the database at the same time. But when iterating, it's possible that "a" was there at the start. Then by the end, "a" was removed and "z" added, so both "a" and "z" would appear in the emitted stream. Note that FoundationDB has APIs which exhibit the same "relaxed" constraints:
>>>>>>>> https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
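Footnote [2] is the crux of the whole thread, so a minimal sketch of skip-by-iteration using the public FoundationDB Python bindings may help. The key range and skip value are made up; the point is that skipping means reading and discarding rows inside one transaction, and the `@fdb.transactional` decorator retries transaction_too_old by default, which mirrors the request-handler retry loop Nick describes.

```
# Why a large ?skip= may never finish: without efficient key-selector
# offsets, skipping 100000 rows means iterating over them, which may not
# fit in the 5-second transaction window; the transactional decorator
# then restarts the whole scan from scratch rather than failing cleanly.
import fdb

fdb.api_version(630)
db = fdb.open()  # assumes a default cluster file (hypothetical setup)

@fdb.transactional
def first_key_after_skip(tr, begin, end, skip):
    seen = 0
    for kv in tr.get_range(begin, end):  # reads and discards, row by row
        if seen == skip:
            return kv.key  # the row a skipping reader actually wants
        seen += 1
    return None

# e.g. first_key_after_skip(db, b'docs/', b'docs0', 100000)
```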