couchdb-dev mailing list archives

From Robert Samuel Newson <rnew...@apache.org>
Subject Re: [DISCUSS] Streaming API in CouchDB 4.0
Date Thu, 23 Apr 2020 18:43:50 GMT

I think it's a key difference from "cursors" as I've seen them elsewhere that ours would point at an ever-changing database; you couldn't seamlessly cursor through a large data set, one "page" at a time.

Bookmarks began in search (raises guilty hand) in order to address a Lucene-specific issue
(that high values of "skip" are incredibly inefficient, using lots of RAM). That is not true
for CouchDB's own indexes, which can be navigated perfectly with startkey/endkey/startkey_docid/endkey_docid,
etc.
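
For example, a minimal sketch of that key-based paging in Python (the host, the database name "mydb", and the page size are illustrative assumptions; the requests library does the HTTP):

```
# Page through _all_docs with startkey instead of skip: fetch one extra
# row per page, then resume exactly at that extra row on the next request.
import json
import requests

BASE = "http://localhost:5984/mydb/_all_docs"
PAGE = 100

params = {"limit": PAGE + 1}
while True:
    rows = requests.get(BASE, params=params).json()["rows"]
    for row in rows[:PAGE]:
        print(row["id"])          # process one page of rows
    if len(rows) <= PAGE:
        break                     # fewer than PAGE+1 rows: last page
    # startkey values are JSON-encoded; no skip parameter is needed.
    params["startkey"] = json.dumps(rows[PAGE]["id"])
```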

I guess I'm not helping much with these observations, but I wouldn't like to see CouchDB gain an additional and ugly method of doing something that is already possible.

B.

> On 23 Apr 2020, at 19:02, Joan Touzet <wohali@apache.org> wrote:
> 
> I realise this is bikeshedding, but I guess that's kind of the point... Everything below is my opinion, not "fact."
> 
> It's unfortunate we need a new endpoint for all of this. In a vacuum I might have just suggested we use the semantics we already have, perhaps with ?from= instead of ?since= .
> 
> "page" only works if the size of a page is well known, either by server preference or
directly in the URL. If I ask for:
> 
>  GET /{db}/_all_docs?limit=20&page=3
> 
> I know that I'm always going to get documents 41 through 60 in the default collation order.
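> 
> (Purely to spell out the arithmetic implied here; a hypothetical server would translate page into an offset like so:)
> 
> ```
> # page N with page size `limit` covers rows limit*(N-1)+1 .. limit*N
> limit, page = 20, 3
> first = limit * (page - 1) + 1   # 41
> last = limit * page              # 60
> skip = first - 1                 # the equivalent skip= offset: 40
> ```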
> 
> There's a *fantastic* summary of examples from popular REST APIs here:
> 
> https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c
> 
> We are *pretty close* to what a cursor means in those other examples, except for the fact that our cursor can go stale/invalid after a short time.
> 
> Bob, could you be a bit more detailed in your explanation of how our definition isn't close to these? Or did you mean SQL CURSOR (which is something entirely different)? If so, I'm fine with this being a REST API cursor - something clearly distinct.
> 
> I come back to wanting to preserve the existing endpoint syntax and naming, without new endpoints, but specifying this new FDB token via ?cursor= and making that the trigger for the new behaviour. At some point, we simply stop accepting ?since= tokens. This seems in line with other popular REST APIs.
> 
> -Joan "still sick and not sleeping right" Touzet
> 
> 
> On 2020-04-23 12:30, Robert Newson wrote:
>> "cursor" has an established meaning in other databases, and ours would not be very close to them. I don't think it's a good idea.
>> B.
>>> On 23 Apr 2020, at 11:50, Ilya Khlopotov <iilyak@apache.org> wrote:
>>> 
>>> 
>>>> 
>>>> The best I could come up with is replacing page with
>>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
>>> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
>>> 
>>>> On 2020/04/23 08:54:36, Garren Smith <garren@apache.org> wrote:
>>>> I agree with Bob that page doesn't make sense as an endpoint. I'm also
>>>> rubbish with naming. The best I could come up with is replacing page with
>>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
>>>> All the fields in the bookmark make sense except timestamp. Why would it
>>>> matter if the timestamp is old? What happens if a node's time is an hour
>>>> behind another node?
>>>> 
>>>> 
>>>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov <iilyak@apache.org> wrote:
>>>>> 
>>>>> - page is to provide some notion of progress for the user
>>>>> - timestamp - I was thinking that we should drop requests if a user tried to pass a bookmark created an hour ago.
>>>>> 
>>>>>> On 2020/04/22 21:58:40, Robert Samuel Newson <rnewson@apache.org> wrote:
>>>>>> "page" and "page number" are odd to me as these don't exist as concepts,
>>>>> I'd rather not invent them. I note there's no mention of page size, which
>>>>> makes "page number" very vague.
>>>>>> 
>>>>>> What is "timestamp" in the bookmark and what effect does it have when the bookmark is passed back in?
>>>>>> 
>>>>>> I guess, why does the bookmark include so much extraneous data? Items that are not needed to find the fdb key from which to begin the next response.
>>>>>> 
>>>>>> 
>>>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov <iilyak@apache.org> wrote:
>>>>>>> 
>>>>>>> Hello everyone,
>>>>>>> 
>>>>>>> Based on the discussions on the thread I would like to propose a number of first steps:
>>>>>>> 1) introduce new endpoints
>>>>>>> - {db}/_all_docs/page
>>>>>>> - {db}/_all_docs/queries/page
>>>>>>> - _all_dbs/page
>>>>>>> - _dbs_info/page
>>>>>>> - {db}/_design/{ddoc}/_view/{view}/page
>>>>>>> - {db}/_design/{ddoc}/_view/{view}/queries/page
>>>>>>> - {db}/_find/page
>>>>>>> 
>>>>>>> These new endpoints would act as follows:
>>>>>>> - don't use delayed responses
>>>>>>> - return an object with the following structure
>>>>>>> ```
>>>>>>> {
>>>>>>>    "total": Total,
>>>>>>>    "bookmark": base64 encoded opaque value,
>>>>>>>    "completed": true | false,
>>>>>>>    "update_seq": when available,
>>>>>>>    "page": current page number,
>>>>>>>    "items": [
>>>>>>>    ]
>>>>>>> }
>>>>>>> ```
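>>>>>>> 
>>>>>>> (A hypothetical Python client loop against such an endpoint; the path and response fields follow this proposal, while the host, the database name, and echoing the bookmark back as a query parameter are my assumptions:)
>>>>>>> ```
>>>>>>> import requests
>>>>>>> 
>>>>>>> url = "http://localhost:5984/mydb/_all_docs/page"
>>>>>>> params = {}
>>>>>>> while True:
>>>>>>>     body = requests.get(url, params=params).json()
>>>>>>>     for item in body["items"]:
>>>>>>>         print(item)                  # one page of results
>>>>>>>     if body["completed"]:
>>>>>>>         break
>>>>>>>     # The opaque bookmark carries all state for the next page.
>>>>>>>     params = {"bookmark": body["bookmark"]}
>>>>>>> ```
>>>>>>> 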
>>>>>>> - the bookmark would include the following data (base64 or protobuf???):
>>>>>>> ```
>>>>>>> - direction
>>>>>>> - page
>>>>>>> - descending
>>>>>>> - endkey
>>>>>>> - endkey_docid
>>>>>>> - inclusive_end
>>>>>>> - startkey
>>>>>>> - startkey_docid
>>>>>>> - last_key
>>>>>>> - update_seq
>>>>>>> - timestamp
>>>>>>> ```
>>>>>>> 
>>>>>>> 2) Implement per-endpoint configurable max limits
>>>>>>> ```
>>>>>>> _all_docs = 5000
>>>>>>> _all_docs/queries = 5000
>>>>>>> _all_dbs = 5000
>>>>>>> _dbs_info = 5000
>>>>>>> _view = 2500
>>>>>>> _view/queries = 2500
>>>>>>> _find = 2500
>>>>>>> ```
>>>>>>> 
>>>>>>> Later (after a few years) CouchDB would deprecate and remove the old endpoints.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> iilyak
>>>>>>> 
>>>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc <vatamane@apache.org> wrote:
>>>>>>>> Hello everyone,
>>>>>>>> 
>>>>>>>> I'd like to discuss the shape and behavior of streaming APIs for CouchDB 4.x.
>>>>>>>> 
>>>>>>>> By "streaming APIs" I mean APIs which stream data row by row as it gets read from the database. These are the endpoints I was thinking of:
>>>>>>>> 
>>>>>>>> _all_docs, _all_dbs, _dbs_info and query results
>>>>>>>> 
>>>>>>>> I want to focus on what happens when FoundationDB transactions time out after 5 seconds. Currently, all those APIs except _changes[1] feeds will crash or freeze. The reason is that the transaction_too_old error at the end of 5 seconds is retry-able by default, so the request handlers run again and end up shoving the whole request down the socket again, headers and all, which is obviously broken and not what we want.
>>>>>>>> 
>>>>>>>> There are a few alternatives discussed in the couchdb-dev channel. I'll present some behaviors but feel free to add more. Some ideas might have been discounted in the IRC discussion already, but I'll present them anyway in case it sparks further conversation:
>>>>>>>> 
>>>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and continue streaming the data from the next key after the last one emitted in the previous transaction. Document the API behavior change that the view of the data presented may not be a point-in-time[4] snapshot of the DB.
>>>>>>>> 
>>>>>>>> - Keeps the API shape the same as CouchDB <4.0. Client libraries don't have to change to continue using these CouchDB 4.0 endpoints.
>>>>>>>> - This is the easiest to implement since it would re-use the implementation for the _changes feed (an extra option passed to the fold function).
>>>>>>>> - Breaks API behavior if users relied on having a point-in-time[4] snapshot view of the data.
>>>>>>>> 
>>>>>>>> B) Simply end the stream. Let the users pass a `?transaction=true` param which indicates they are aware the stream may end early and so would have to paginate from the last emitted key with a skip=1. This will keep the request bodies the same as current CouchDB. However, if the users got all the data in one request, they will end up wasting another request to see if there is more data available. If they didn't get any data they might have too large a skip value (see [2]) and so would have to guess different values for start/end keys. Or we could impose a max limit for the `skip` parameter.
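>>>>>>>> 
>>>>>>>> (A Python sketch of that resume-with-skip=1 pattern; `?transaction=true` is the flag proposed above, the rest is the existing API, and the host/database names are made up:)
>>>>>>>> ```
>>>>>>>> # Keep re-requesting from the last emitted key until a page is empty.
>>>>>>>> # skip=1 drops the boundary row the previous page already emitted.
>>>>>>>> import json
>>>>>>>> import requests
>>>>>>>> 
>>>>>>>> url = "http://localhost:5984/mydb/_all_docs"
>>>>>>>> params = {"transaction": "true"}
>>>>>>>> while True:
>>>>>>>>     rows = requests.get(url, params=params).json()["rows"]
>>>>>>>>     if not rows:
>>>>>>>>         break             # the wasted extra request mentioned above
>>>>>>>>     for row in rows:
>>>>>>>>         print(row["id"])
>>>>>>>>     # The stream may have ended early; resume just past the last row.
>>>>>>>>     params.update(startkey=json.dumps(rows[-1]["key"]), skip=1)
>>>>>>>> ```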
>>>>>>>> 
>>>>>>>> C) End the stream and add a final metadata row like a "transaction": "timeout" at the end. That will let the user know to keep paginating from the last key onward. This won't work for `_all_dbs` and `_dbs_info`[3]. Maybe let those two endpoints behave like _changes feeds and only use this for views and _all_docs? If we like this choice, let's think about what happens for those, as I couldn't come up with anything decent there.
>>>>>>>> 
>>>>>>>> D) Same as C but, to solve the issue with skips[2], emit a bookmark "key" of where the iteration stopped and the current "skip" and "limit" params, which would keep decreasing. Then the user would pass those in "start_key=..." in the next request along with the limit and skip params. So something like "continuation":{"skip":599, "limit":5, "key":"..."}. This has the same issue with array results for `_all_dbs` and `_dbs_info`[3].
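>>>>>>>> 
>>>>>>>> (Correspondingly, a hypothetical client under this option would feed the continuation row back in; the field names follow the example above, everything else is illustrative:)
>>>>>>>> ```
>>>>>>>> import json
>>>>>>>> import requests
>>>>>>>> 
>>>>>>>> url = "http://localhost:5984/mydb/_all_docs"
>>>>>>>> params = {"skip": 600, "limit": 5000}
>>>>>>>> while True:
>>>>>>>>     body = requests.get(url, params=params).json()
>>>>>>>>     for row in body["rows"]:
>>>>>>>>         print(row["id"])
>>>>>>>>     cont = body.get("continuation")
>>>>>>>>     if cont is None:
>>>>>>>>         break             # no metadata row: the stream completed
>>>>>>>>     # Carry the server-reported remaining skip/limit into the next call.
>>>>>>>>     params = {"startkey": json.dumps(cont["key"]),
>>>>>>>>               "skip": cont["skip"], "limit": cont["limit"]}
>>>>>>>> ```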
>>>>>>>> 
>>>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum values there such that the response time is likely to fit in one transaction. This could be tricky as different runtime environments will have different characteristics. Also, if the timeout happens there isn't a nice way to send an HTTP error since we already sent the 200 response. The downside is that this might break how some users use the API, if, say, they are using large skips and limits already. Perhaps here we do both B and D, such that if users want transactional behavior, they specify that `transaction=true` param and only then do we enforce low limit and skip maximums.
>>>>>>>> 
>>>>>>>> F) At least for `_all_docs` it seems providing a point-in-time snapshot view doesn't necessarily need to be tied to transaction boundaries. We could check the update sequence of the database at the start of the next transaction and, if it hasn't changed, continue emitting a consistent view. This can apply to C and D and would just determine when the stream ends. If there are no writes happening to the db, this could potentially stream all the data just like option A would. Not entirely sure if this would work for views.
>>>>>>>> 
>>>>>>>> So what do we think? I can see different combinations of options here, maybe even different ones for each API endpoint. For example, `_all_dbs` and `_dbs_info` are always A, and `_all_docs` and views default to A but have parameters to do F, etc.
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> -Nick
>>>>>>>> 
>>>>>>>> Some footnotes:
>>>>>>>> 
>>>>>>>> [1] _changes feeds is the only one that works currently. It behaves as per RFC https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns. That is, we continue streaming the data by resetting the transaction object and restarting from the last emitted key (the db sequence in this case). However, because the transaction restarts, a document that is updated while the streaming takes place may appear in the _changes feed twice. That's a behavior difference from CouchDB < 4.0 and we'd have to document it, since previously we presented a point-in-time snapshot of the database from when we started streaming.
>>>>>>>> 
>>>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB doesn't currently support efficient offsets for key selectors (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging), we implemented skip by iterating over the data. This means that a skip of, say, 100000 could keep timing out the transaction without yielding any data.
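>>>>>>>> 
>>>>>>>> (To make that cost concrete, a Python sketch of skip-by-iteration over an FDB range; it mirrors the approach described, not CouchDB's actual code:)
>>>>>>>> ```
>>>>>>>> import fdb
>>>>>>>> fdb.api_version(630)
>>>>>>>> db = fdb.open()
>>>>>>>> 
>>>>>>>> @fdb.transactional
>>>>>>>> def read_page(tr, begin, end, skip, limit):
>>>>>>>>     out = []
>>>>>>>>     for kv in tr.get_range(begin, end):
>>>>>>>>         if skip > 0:
>>>>>>>>             skip -= 1     # every skipped row is still read from FDB
>>>>>>>>             continue
>>>>>>>>         out.append(kv)
>>>>>>>>         if len(out) >= limit:
>>>>>>>>             break
>>>>>>>>     return out            # skip=100000 reads 100000 rows before any data
>>>>>>>> ```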
>>>>>>>> 
>>>>>>>> [3] _all_dbs and _dbs_info return a JSON array so they don't have an obvious place to insert a last metadata row.
>>>>>>>> 
>>>>>>>> [4] For example, users might have a constraint that documents "a" and "z" cannot both be in the database at the same time. But when iterating it's possible that "a" was there at the start. Then by the end, "a" was removed and "z" added, so both "a" and "z" would appear in the emitted stream. Note that FoundationDB has APIs which exhibit the same "relaxed" constraints:
>>>>>>>> 
>>>>>>>> https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 

