couchdb-dev mailing list archives

From Ilya Khlopotov <iil...@apache.org>
Subject Re: # [DISCUSS] : things we need to solve/decide : storage of edit conflicts
Date Mon, 11 Feb 2019 18:48:24 GMT


On 2019/02/11 16:04:45, Adam Kocoloski <kocolosk@apache.org> wrote: 
> Thanks Ilya. Some comments:
> 
> > - `{NS} / {docid} / _info` = '{"scheme": {scheme_name} / {scheme_revision}, "revision":
{revision}}' 
> > - `{NS} / {docid} / _data / {compressed_json_path} = latest_value | part`
> > - `{NS} / {docid} / {revision} / _info` = '{"scheme": {scheme_name} / {scheme_revision}}'
> > - `{NS} / {docid} / {revision} / _data / {compressed_json_path} = value | part`
> 
> Is this meant to distinguish between “winning” and “losing” revisions? What I
dislike about that distinction is that an update to the currently “losing” branch could
promote it to “winning” if it becomes the longest branch. So now you need to go and grab
the entire revision tree to know whether the update ought to be written into the {docid} /
_data space or not.

Indeed, determining whether the update ought to be written into the {docid} / _data space or not is a problem.
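To make the promotion hazard concrete, here is a minimal Python sketch (the function name and the leaf tuple layout are illustrative assumptions, not actual layer code) of CouchDB-style winner selection: one more edit on the "losing" branch makes it the longest live branch and flips the winner, so the writer cannot decide where an update belongs without reading the whole tree:

```python
# Hedged sketch: winner selection over revision-tree leaves, assuming
# leaves are (is_deleted, rev_pos, rev_id) tuples as in the proposed
# _index / _revs keys.

def pick_winner(leaves):
    # Prefer non-deleted leaves; among those, the longest branch wins,
    # with rev_id as a deterministic tie-breaker.
    return max(leaves, key=lambda l: (not l[0], l[1], l[2]))

leaves = [
    (False, 3, "c3"),   # current winner: longest live branch
    (False, 2, "b2"),   # "losing" branch
]
assert pick_winner(leaves) == (False, 3, "c3")

# One more edit on the losing branch makes it the longest, so it now
# wins -- which is why the writer must examine the whole tree before
# deciding whether to update the {docid} / _data space.
leaves = [
    (False, 3, "c3"),
    (False, 4, "d4"),   # extended former loser
]
assert pick_winner(leaves) == (False, 4, "d4")
```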
> 
> > ## Read latest revision
> 
> > - We do range read "{NS} / {docid}" and assemble documents using results of the
query. 
> 
> 
> In the model you outlined this means you’re reading the entire revision tree and the
body of every revision, when all you really wanted was the body of the winning revision. That’s
a lot of unnecessary data transfer that I thought we were trying hard to avoid.
You are right, we should read only the following two ranges:
- `{NS} / {docid} / _info`
- `{NS} / {docid} / _data`
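As a rough illustration (a plain dict stands in for FDB, and tuple keys stand in for packed subspace keys; all names are assumptions), scanning just these two prefixes skips the per-revision copies entirely:

```python
# Hedged sketch of reading only the winner: scan {docid}/_info and
# {docid}/_data, leaving the {docid}/{revision}/... copies untouched.

kv = {
    ("db", "doc1", "_info"): '{"revision": "2-abc"}',
    ("db", "doc1", "_data", 10, 18, 13): "English",
    ("db", "doc1", "_data", 10, 18, 14): "English (UK)",
    # per-revision copy that a naive full-{docid} scan would also drag in:
    ("db", "doc1", "1-xyz", "_data", 10, 18, 13): "old value",
}

def range_read(prefix):
    # stand-in for an FDB range read over a key prefix
    return {k: v for k, v in sorted(kv.items()) if k[:len(prefix)] == prefix}

info = range_read(("db", "doc1", "_info"))
data = range_read(("db", "doc1", "_data"))

assert len(data) == 2  # the 1-xyz per-revision copy was not read
assert ("db", "doc1", "1-xyz", "_data", 10, 18, 13) not in data
```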

> 
> > ## Write 
> > ...
> > - read `{NS} / {docid} / _info`, verify that revision is equal to specified parent_revision
and add the key into write conflict range
> 
> I think you meant _index here, not _info. 
I meant _info, but the name doesn't really matter here. 

> 
> >  - `{NS} / {docid} / _index / _revs / {is_deleted} / {rev_pos} / {new_revision}
= {parent_revision}`
> 
> Are you suggesting that every edit creates a new KV in the _revs space? How would you
tell which ones are leafs? How would you do stemming of branches?
> 
> I had proposed to have the whole contents of the edit branch as a single KV. When updating
a branch, you retrieve that branch KV, clear it, and write a new one with the stemmed content
of the ancestry into a key that includes the new leaf revision. This also means you don’t
explicitly need to add the old key into the write conflict range (because you’re clearing
it, which will do that implicitly).
> 
Touché.
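A minimal sketch of that branch-as-single-KV proposal, assuming a dict as the transaction and a hypothetical `REVS_LIMIT` stemming depth:

```python
# Hedged sketch of the single-KV-per-branch idea: each edit branch is one
# key holding its stemmed ancestry; an update clears that key and writes
# a new one keyed by the new leaf revision. In FDB the clear itself puts
# the old key in the write conflict range, so no explicit call is needed.

REVS_LIMIT = 3  # stemming depth, illustrative

kv = {("db", "doc1", "_revs", "3-c"): ["3-c", "2-b", "1-a"]}

def update_branch(kv, docid, leaf, new_rev):
    old_key = ("db", docid, "_revs", leaf)
    ancestry = kv.pop(old_key)               # "clear" the old branch KV
    stemmed = ([new_rev] + ancestry)[:REVS_LIMIT]
    kv[("db", docid, "_revs", new_rev)] = stemmed

update_branch(kv, "doc1", "3-c", "4-d")
assert kv == {("db", "doc1", "_revs", "4-d"): ["4-d", "3-c", "2-b"]}
```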

> The path compression and _changes watcher stuff is being discussed in other threads so
I’ll avoid any commentary here. Cheers,
> 
> Adam
> 
> > On Feb 8, 2019, at 7:48 PM, Ilya Khlopotov <iilyak@apache.org> wrote:
> > 
> > # Data model without support for per key revisions
> > 
> > In this model, "per key revisions" support was sacrificed so we can avoid reading the previous revision of the document when we write a new version of it.
> > 
> > # Ranges used in the model
> > 
> > - `{NS} / _mapping / _last_field_id`
> > - `{NS} / _mapping / _by_field / {field_name} = field_id` # we would cache it in
Layer's memory
> > - `{NS} / _mapping / _by_field_id / {field_id} = field_name` # we would cache it
in Layer's memory
> > - `{NS} / {docid} / _info` = '{"scheme": {scheme_name} / {scheme_revision}, "revision":
{revision}}' 
> > - `{NS} / {docid} / _data / {compressed_json_path} = latest_value | part`
> > - `{NS} / {docid} / {revision} / _info` = '{"scheme": {scheme_name} / {scheme_revision}}'
> > - `{NS} / {docid} / {revision} / _data / {compressed_json_path} = value | part`
> > - `{NS} / {docid} / _index / _revs / {is_deleted} / {rev_pos} / {revision} = {parent_revision}`
> > - `{NS} / _index / _by_seq / {seq}` = "{docid} / {revision}" # seq is a FDB versionstamp
> > 
> > We would have a few special documents:
> > - "_schema / {schema_name}" - this doc would contain validation rules for the schema (not used in MVP).
> > - when we start using schema we would be able to populate `{NS} / _mapping / xxx`
range when we write schema document
> > - the schema document MUST fit into 100K (we don't use the flattened JSON model for it)
> > 
> > # JSON path compression
> > 
> > - Assign an integer field_id to every unique field_name of a JSON document, starting from 10.
> > - We would use the first 10 integers to encode the type of the value:
> >  - 0 - the value is an array
> >  - 1 - the value is a big scalar value broken down into multiple parts
> >  - 2..9 - reserved for future use
> > - Replace field names in JSON path with field IDs
> > 
> > ## Example of compressed JSON 
> > ```
> > {
> >    foo: {
> >        bar: {
> >          baz: [1, 2, 3]
> >        },
> >        langs: {
> >           "en_US": "English",
> >           "en_UK": "English (UK)",
> >           "en_CA": "English (Canada)",
> >           "zh_CN": "Chinese (China)" 
> >        },
> >        translations: {
> >           "en_US": {
> >               "license": "200 Kb of text"
> >           }
> >        }
> >    }
> > }
> > ```
> > this document would be compressed into
> > ```
> > # written in separate transaction and cached in the Layer
> > {NS} / _mapping / _by_field / foo = 10
> > {NS} / _mapping / _by_field / bar = 12
> > {NS} / _mapping / _by_field / baz = 11
> > {NS} / _mapping / _by_field / langs = 18
> > {NS} / _mapping / _by_field / en_US = 13
> > {NS} / _mapping / _by_field / en_UK = 14
> > {NS} / _mapping / _by_field / en_CA = 15
> > {NS} / _mapping / _by_field / zh_CN = 16
> > {NS} / _mapping / _by_field / translations = 17
> > {NS} / _mapping / _by_field / license = 19
> > {NS} / _mapping / _by_field_id / 10 = foo
> > {NS} / _mapping / _by_field_id / 12 = bar
> > {NS} / _mapping / _by_field_id / 11 = baz
> > {NS} / _mapping / _by_field_id  / 18 = langs
> > {NS} / _mapping / _by_field_id  / 13 = en_US
> > {NS} / _mapping / _by_field_id  / 14 = en_UK
> > {NS} / _mapping / _by_field_id  / 15 = en_CA
> > {NS} / _mapping / _by_field_id  / 16 = zh_CN
> > {NS} / _mapping / _by_field_id  / 17 = translations
> > {NS} / _mapping / _by_field_id  / 19 = license
> > 
> > # written on document write
> > {NS} / {docid} / _data / 10 / 12 / 11 / 0 / 0 = 1
> > {NS} / {docid} / _data / 10 / 12 / 11 / 0 / 1 = 2
> > {NS} / {docid} / _data / 10 / 12 / 11 / 0 / 2 = 3
> > {NS} / {docid} / _data / 10 / 18 / 13 = English
> > {NS} / {docid} / _data / 10 / 18 / 14 = English (UK)
> > {NS} / {docid} / _data / 10 / 18 / 15 = English (Canada)
> > {NS} / {docid} / _data / 10 / 18 / 16 = Chinese (China)
> > {NS} / {docid} / _data / 10 / 17 / 13 / 19 / 1 / 0 = first 100K of license
> > {NS} / {docid} / _data / 10 / 17 / 13 / 19 / 1 / 1 = second 100K of license
> > ```
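For illustration, a hedged Python sketch of the flattening and field-id assignment described above (ids here are handed out in traversal order, unlike the arbitrary order in the example; the dict is a stand-in for the `{NS} / _mapping / _by_field` range):

```python
# Hedged sketch of JSON flattening + field-id compression. Field ids
# start at 10; 0 marks array elements, matching the reserved type codes.

def flatten(value, path=()):
    # yield (json_path, scalar) pairs for every leaf of the document
    if isinstance(value, dict):
        for k, v in value.items():
            yield from flatten(v, path + (k,))
    elif isinstance(value, list):
        for i, v in enumerate(value):
            yield from flatten(v, path + (0, i))  # 0 = array type code
    else:
        yield path, value

mapping, next_id = {}, 10

def field_id(name):
    # real layer: a separate key-allocation transaction plus a cache
    global next_id
    if name not in mapping:
        mapping[name] = next_id
        next_id += 1
    return mapping[name]

doc = {"foo": {"bar": {"baz": [1, 2, 3]}}}
compressed = {
    tuple(field_id(p) if isinstance(p, str) else p for p in path): v
    for path, v in flatten(doc)
}
assert compressed == {(10, 11, 12, 0, 0): 1, (10, 11, 12, 0, 1): 2,
                      (10, 11, 12, 0, 2): 3}
```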
> > 
> > # Operations
> > 
> > 
> > ## Read latest revision
> > 
> > - We do range read "{NS} / {docid}" and assemble documents using results of the
query. 
> > - If we cannot find field_id in Layer's cache we would read "{NS} / _mapping / _by_field_id
" range and cache the result.
> > 
> > ## Read specified revision
> > 
> > - Do a range read "{NS} / {docid} / {revision} /" and assemble the document using the result of the query
> > - If we cannot find field_id in Layer's cache we would read "{NS} / _mapping / _by_field_id
" range and cache the result.
> > 
> > ## Write 
> > 
> > - flatten JSON
> > - check if there are missing fields in the field cache of the Layer
> > - if the keys are missing, start a key allocation transaction
> >  - read "{NS} / _mapping / _by_field / {field_name}"
> >    - if it doesn't exist, add the key to the write conflict range (FDB would do it by default)
> >  - `field_idx = txn["{NS} / _mapping / _last_field_id"] + 1` and add the key to the write conflict range (FDB would do it by default)
> >  - write `"{NS} / _mapping / _last_field_id" = field_idx`
> >  - write `"{NS} / _mapping / _by_field / {field_name}" = field_idx` 
> >  - write `"{NS} / _mapping / _by_field_id / {field_idx}" = field_name` 
> > - read `{NS} / {docid} / _info`, verify that the revision is equal to the specified parent_revision, and add the key to the write conflict range
> > - generate new_revision
> > - write all fields into two ranges (split big values as needed)
> >   - "{NS} / {docid} / _data / {compressed_json_path}"
> >   - "{NS} / {docid} / {new_revision} / _data / {compressed_json_path}"
> > - write into the following keys
> >  - `{NS} / {docid} / _info` = '{"scheme": {scheme_name} / {scheme_revision}, "revision": {new_revision}}'
> >  - `{NS} / {docid} / {new_revision} / _info` = '{"scheme": {scheme_name} / {scheme_revision}}'
> >  - `{NS} / {docid} / _index / _revs / {is_deleted} / {rev_pos} / {new_revision} = {parent_revision}`
> >  - `{NS} / _index / _by_seq / {seq}` = "{docid} / {new_revision}" # seq is a FDB versionstamp
> > - update database stats
> >  - `{NS} / _meta / number_of_docs` += 1
> >  - `{NS} / _meta / external_size` += external_size
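The key allocation step above could look roughly like this (a dict plays the transaction; in FDB the reads shown would land in the write conflict range automatically, as noted above):

```python
# Hedged sketch of field-id allocation: a counter key hands out ids, and
# both the by-name and by-id mappings are written in the same transaction.

kv = {("db", "_mapping", "_last_field_id"): 17}

def allocate_field(kv, ns, name):
    by_field = (ns, "_mapping", "_by_field", name)
    if by_field in kv:                        # already allocated
        return kv[by_field]
    field_idx = kv[(ns, "_mapping", "_last_field_id")] + 1
    kv[(ns, "_mapping", "_last_field_id")] = field_idx
    kv[by_field] = field_idx
    kv[(ns, "_mapping", "_by_field_id", field_idx)] = name
    return field_idx

assert allocate_field(kv, "db", "translations") == 18
assert allocate_field(kv, "db", "translations") == 18   # idempotent
assert kv[("db", "_mapping", "_by_field_id", 18)] == "translations"
```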
> > 
> > ## Get list of all known revisions for the document
> > 
> > - range query `{NS} / {docid} / _index / _revs /`
> > 
> > ## Changes feed
> > 
> > - we would set a watch on the `{NS} / _meta / external_size` key
> > - when the watch fires we would do a range query starting from "{NS} / _index / _by_seq / {since_seq}"
> > - remember the last key returned by the range query to set a new value for since_seq
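One iteration of that loop might be sketched as follows (integer seqs stand in for FDB versionstamps; the dict is a mock of the `_by_seq` range):

```python
# Hedged sketch of one changes-feed iteration: after the watch fires,
# scan _by_seq from since_seq (exclusive) and remember the last seq seen.

kv = {
    ("db", "_index", "_by_seq", 1): "doc1 / 1-a",
    ("db", "_index", "_by_seq", 2): "doc2 / 1-b",
    ("db", "_index", "_by_seq", 5): "doc1 / 2-c",
}

def changes(kv, ns, since_seq):
    rows = [(k[3], v) for k, v in sorted(kv.items())
            if k[:3] == (ns, "_index", "_by_seq") and k[3] > since_seq]
    last_seq = rows[-1][0] if rows else since_seq
    return rows, last_seq

rows, last_seq = changes(kv, "db", 1)
assert [v for _, v in rows] == ["doc2 / 1-b", "doc1 / 2-c"]
assert last_seq == 5
```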
> > 
> > best regards,
> > iilyak
> > On 2019/02/04 19:25:13, Ilya Khlopotov <iilyak@apache.org> wrote: 
> >> This is a beginning of a discussion thread about storage of edit conflicts and
everything which relates to revisions.
> >> 
> >> 
> >> 
> 
> 
