Subject: Re: [DISCUSS] Streaming API in CouchDB 4.0
From: Ilya Khlopotov
To: dev@couchdb.apache.org
Date: Thu, 23 Apr 2020 22:27:08 -0000

Hello,

I did an experiment and would like to share the results. So far I have implemented only `_all_dbs/cursor`.

Here is how it works (I have only 2 databases):

curl -u adm:pass "http://127.0.0.1:15984/_all_dbs/cursor?limit=1" | jq '.'
{
  "items": [
    "_users"
  ],
  "completed": false,
  "bookmark": "AYNsAAAAA2gCbQAAAAIxMWEBaAJtAAAAATdtAAAAAf9oAm0AAAABNW0AAAAEdGVzdGo"
}

curl -u adm:pass "http://127.0.0.1:15984/_all_dbs/cursor?bookmark=AYNsAAAAA2gCbQAAAAIxMWEBaAJtAAAAATdtAAAAAf9oAm0AAAABNW0AAAAEdGVzdGo" | jq '.'
{
  "items": [
    "test"
  ],
  "completed": true
}

The bookmark "remembers" all of the initial query parameters passed to the endpoint, and it is opaque to the end user.

The `completed` field was tricky to obtain. The solution was to request `limit + 1` elements from FDB. If FDB returns fewer than `limit + 1` elements, it means there are no more elements and we can set `completed` to `true`.

Alternative names for the endpoint which came to mind:
- page
- cursor
- step
- iterate
- continue

During the experiment I learned that it is impossible to calculate the total, in which case maintaining a `page` field wouldn't make sense.

Regarding `update_seq`, my thinking was that the user would get the sequence from when the request was started, and after finishing the iteration she could request changes since that `update_seq`, with a filter function to return only the updated keys in the range. I agree that it is currently not used and we can add it when we decide it is worth it.

Best regards,
iilyak
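As an aside, the `limit + 1` completion check and an opaque bookmark can be sketched in Python like so. Here `fetch_range` and the JSON-based bookmark encoding are hypothetical stand-ins for illustration only; the actual bookmark above is opaque (it appears to be an encoded Erlang term):

```
import base64
import json

def encode_bookmark(state):
    # Opaque to the client: all of the original query parameters
    # plus the position to resume from.
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()

def decode_bookmark(bookmark):
    return json.loads(base64.urlsafe_b64decode(bookmark.encode()))

def cursor_page(fetch_range, state):
    # Ask the storage layer for one extra row: receiving fewer than
    # limit + 1 rows back means there is nothing left to iterate.
    rows = fetch_range(state["start_key"], limit=state["limit"] + 1)
    completed = len(rows) < state["limit"] + 1
    items = rows[:state["limit"]]
    response = {"items": items, "completed": completed}
    if not completed:
        # Resume strictly after the last emitted key on the next call
        # (assuming fetch_range treats start_key as exclusive).
        state["start_key"] = items[-1]
        response["bookmark"] = encode_bookmark(state)
    return response
```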
On 2020/04/22 20:18:57, Ilya Khlopotov wrote:
> Hello everyone,
>
> Based on the discussions on the thread I would like to propose a number of first steps:
> 1) introduce new endpoints
> - {db}/_all_docs/page
> - {db}/_all_docs/queries/page
> - _all_dbs/page
> - _dbs_info/page
> - {db}/_design/{ddoc}/_view/{view}/page
> - {db}/_design/{ddoc}/_view/{view}/queries/page
> - {db}/_find/page
>
> These new endpoints would act as follows:
> - don't use delayed responses
> - return an object with the following structure
> ```
> {
>     "total": Total,
>     "bookmark": base64 encoded opaque value,
>     "completed": true | false,
>     "update_seq": when available,
>     "page": current page number,
>     "items": [
>     ]
> }
> ```
> - the bookmark would include the following data (base64 or protobuf???):
>   - direction
>   - page
>   - descending
>   - endkey
>   - endkey_docid
>   - inclusive_end
>   - startkey
>   - startkey_docid
>   - last_key
>   - update_seq
>   - timestamp
>
> 2) Implement per-endpoint configurable max limits
> ```
> _all_docs = 5000
> _all_docs/queries = 5000
> _all_dbs = 5000
> _dbs_info = 5000
> _view = 2500
> _view/queries = 2500
> _find = 2500
> ```
>
> Later (after a few years) CouchDB would deprecate and remove the old endpoints.
>
> Best regards,
> iilyak
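For illustration, a client of the proposed `/page` endpoints might loop on the bookmark until `completed` is true. This sketch uses Python with the `requests` library; the endpoint and response shape are the proposal above, not a shipped API:

```
import requests

def fetch_all_docs(base_url, db, auth, limit=1000):
    # Hypothetical client for the proposed {db}/_all_docs/page endpoint.
    url = f"{base_url}/{db}/_all_docs/page"
    params = {"limit": limit}
    while True:
        resp = requests.get(url, params=params, auth=auth)
        resp.raise_for_status()
        body = resp.json()
        yield from body["items"]
        if body["completed"]:
            break
        # The bookmark carries all of the original query parameters,
        # so it is the only parameter needed for the next request.
        params = {"bookmark": body["bookmark"]}

# Usage:
# for row in fetch_all_docs("http://127.0.0.1:15984", "test", ("adm", "pass")):
#     print(row)
```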
> On 2020/02/19 22:39:45, Nick Vatamaniuc wrote:
> > Hello everyone,
> >
> > I'd like to discuss the shape and behavior of streaming APIs for CouchDB 4.x.
> >
> > By "streaming APIs" I mean APIs which stream data row by row as it gets
> > read from the database. These are the endpoints I was thinking of:
> >
> > _all_docs, _all_dbs, _dbs_info and query results
> >
> > I want to focus on what happens when FoundationDB transactions
> > time out after 5 seconds. Currently, all those APIs except _changes[1]
> > feeds will crash or freeze. The reason is that the transaction_too_old
> > error at the end of 5 seconds is retryable by default, so the request
> > handlers run again and end up shoving the whole request down the
> > socket again, headers and all, which is obviously broken and not what
> > we want.
> >
> > There are a few alternatives discussed in the couchdb-dev channel. I'll
> > present some behaviors but feel free to add more. Some ideas might
> > have been discounted in the IRC discussion already but I'll present
> > them anyway in case it sparks further conversation:
> >
> > A) Do what _changes[1] feeds do. Start a new transaction and continue
> > streaming the data from the next key after the last one emitted in the
> > previous transaction. Document the API behavior change: it may present
> > a view of the data that is never a point-in-time[4] snapshot of the DB.
> >
> > - Keeps the API shape the same as CouchDB < 4.0. Client libraries
> > don't have to change to continue using these CouchDB 4.0 endpoints.
> > - This is the easiest to implement since it would re-use the
> > implementation for the _changes feed (an extra option passed to the
> > fold function).
> > - Breaks API behavior if users relied on having a point-in-time[4]
> > snapshot view of the data.
> >
> > B) Simply end the stream. Let the users pass a `?transaction=true`
> > param which indicates they are aware the stream may end early and so
> > would have to paginate from the last emitted key with a skip=1. This
> > will keep the request bodies the same as current CouchDB. However, if
> > the users got all the data in one request, they will end up wasting
> > another request to see if there is more data available. If they didn't
> > get any data they might have too large of a skip value (see [2]) so
> > would have to guess different values for start/end keys. Or impose a
> > max limit for the `skip` parameter.
> >
> > C) End the stream and add a final metadata row like a "transaction":
> > "timeout" at the end. That will let the user know to keep paginating
> > from the last key onward. This won't work for `_all_dbs` and
> > `_dbs_info`[3]. Maybe let those two endpoints behave like _changes
> > feeds and only use this for views and _all_docs? If we like this
> > choice, let's think about what happens for those, as I couldn't come
> > up with anything decent there.
> >
> > D) Same as C but, to solve the issue with skips[2], emit a bookmark
> > "key" of where the iteration stopped and the current "skip" and
> > "limit" params, which would keep decreasing. Then the user would pass
> > those in "start_key=..." in the next request along with the limit and
> > skip params. So something like "continuation":{"skip":599, "limit":5,
> > "key":"..."}. This has the same issue with array results for
> > `_all_dbs` and `_dbs_info`[3].
> >
> > E) Enforce low `limit` and `skip` parameters. Enforce maximum values
> > there such that the response time is likely to fit in one transaction.
> > This could be tricky as different runtime environments will have
> > different characteristics. Also, if the timeout happens there isn't a
> > nice way to send an HTTP error since we already sent the 200
> > response. The downside is that this might break how some users use the
> > API, if say they are using large skips and limits already. Perhaps here
> > we do both B and D, such that if users want transactional behavior,
> > they specify the `transaction=true` param and only then we enforce
> > low limit and skip maximums.
> >
> > F) At least for `_all_docs` it seems providing a point-in-time
> > snapshot view doesn't necessarily need to be tied to transaction
> > boundaries. We could check the update sequence of the database at the
> > start of the next transaction and if it hasn't changed we can continue
> > emitting a consistent view. This can apply to C and D and would just
> > determine when the stream ends. If there are no writes happening to
> > the db, this could potentially stream all the data just like option A
> > would. Not entirely sure if this would work for views.
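For illustration, the restart-from-last-key pattern in option A (and footnote [1] below) might look roughly like this with the FoundationDB Python bindings; `emit` is a hypothetical callback that writes one row to the HTTP response, and this is a sketch rather than CouchDB's actual code:

```
import fdb

fdb.api_version(620)
db = fdb.open()

def stream_range(begin, end, emit):
    # Stream a key range; when the 5-second transaction limit hits
    # (transaction_too_old, FDB error code 1007), start a fresh
    # transaction just after the last key we managed to emit.
    cursor = fdb.KeySelector.first_greater_or_equal(begin)
    while True:
        tr = db.create_transaction()
        try:
            last_key = None
            for kv in tr.get_range(cursor, end):
                emit(kv.key, kv.value)
                last_key = kv.key
            return  # reached the end of the range
        except fdb.FDBError as e:
            if e.code != 1007:  # not transaction_too_old: re-raise
                raise
            if last_key is not None:
                # Resume strictly after the last emitted key, so rows
                # are not duplicated (but the view is no longer a
                # point-in-time snapshot).
                cursor = fdb.KeySelector.first_greater_than(last_key)
```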
> > So what do we think? I can see different combinations of options here,
> > maybe even a different one for each API endpoint. For example `_all_dbs`
> > and `_dbs_info` are always A, and `_all_docs` and views default to A
> > but have parameters to do F, etc.
> >
> > Cheers,
> > -Nick
> >
> > Some footnotes:
> >
> > [1] _changes feeds are the only ones that work currently. They behave
> > as per the RFC https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns.
> > That is, we continue streaming the data by resetting the transaction
> > object and restarting from the last emitted key (db sequence in this
> > case). However, because the transaction restarts, if a document is
> > updated while the streaming takes place, it may appear in the _changes
> > feed twice. That's a behavior difference from CouchDB < 4.0 and we'd
> > have to document it, since previously we presented a point-in-time
> > snapshot of the database from when we started streaming.
> >
> > [2] Our streaming APIs have both skips and limits. Since FDB doesn't
> > currently support efficient offsets for key selectors
> > (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging)
> > we implemented skip by iterating over the data. This means that a skip
> > of say 100000 could keep timing out the transaction without yielding
> > any data.
> >
> > [3] _all_dbs and _dbs_info return a JSON array so they don't have an
> > obvious place to insert a last metadata row.
> >
> > [4] For example, say clients have a constraint that documents "a" and
> > "z" cannot both be in the database at the same time. When iterating,
> > it's possible that "a" was there at the start; then by the end, "a"
> > was removed and "z" added, so both "a" and "z" would appear in the
> > emitted stream. Note that FoundationDB has APIs which exhibit the same
> > "relaxed" constraints:
> > https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
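To make footnote [2] concrete, skip-by-iteration amounts to reading and discarding rows, so a large skip burns the transaction's 5-second budget before emitting anything. A hypothetical sketch, not CouchDB's actual code:

```
def skip_and_limit(tr, begin, end, skip, limit, emit):
    # Skip is implemented by iterating and discarding rows, since FDB
    # key selectors don't support efficient large offsets. With
    # skip=100000 the transaction can hit transaction_too_old before a
    # single row is emitted.
    seen = 0
    emitted = 0
    for kv in tr.get_range(begin, end):
        if seen < skip:
            seen += 1  # this row is read (and paid for) but not returned
            continue
        emit(kv.key, kv.value)
        emitted += 1
        if emitted >= limit:
            break
```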