couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: # [DISCUSS] : things we need to solve/decide : storage of edit conflicts
Date Mon, 11 Feb 2019 17:22:06 GMT
Oh, for sure we can do that. I was just trying to think of a clever
way that replicated edits could also find their edit branch with a
single read as well instead of having to pull out the entire tree.

On Mon, Feb 11, 2019 at 9:04 AM Adam Kocoloski <kocolosk@apache.org> wrote:
>
> Not sure I follow. If a transaction needs the full revision tree for a single document
it can retrieve that with a single range read for the (“_meta”, DocID) prefix.
>
> Adam
>
> > On Feb 8, 2019, at 6:35 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> >
> > Ah, that all sounds good. The only thing I'm not initially seeing as
> > obvious is how we lookup a revision path to extend during replication
> > when the previous revision may be anywhere in the list of $revs_limit
> > revisions. Feels like there might be some sort of trickery to do that
> > efficiently. Although it may also be good enough to also just issue
> > $revs_limit lookups in parallel given that we're maxed out on either
> > $revs_limit or 2*$revs_limit if we have to check for both deleted and
> > not.
> >
> > On Fri, Feb 8, 2019 at 10:22 AM Adam Kocoloski <kocolosk@apache.org> wrote:
> >>
> >> Totally unacceptable! ;)  In fact some key bits of that model got dispersed
into at least two separate emails so you’re likely not the only one. I’ll restate here:
> >>
> >> The size limits in FoundationDB preclude us from storing the entire key tree
as a single value; in pathological situations the tree could exceed 100KB. Rather, I think
it would make sense to store each edit *branch* as a separate KV. We stem the branch long
before it hits the value size limit, and in the happy case of no edit conflicts this means
we store the edit history metadata in a single KV. It also means that we can apply an interactive
edit without retrieving the entire conflicted revision tree; we need only retrieve and modify
the single branch against which the edit is being applied. The downside is that we may duplicate
historical revision identifiers shared by multiple edit branches, but I think this is a worthwhile
tradeoff.
> >>
> >> I’d also ensure that a document update only needs to read the edit branch
KV against which the update is being applied, and it can read that branch immediately knowing
only the content of the edit that is being attempted (i.e., it does not need to read the current
version of the document itself).
> >>
> >> I think we achieve these goals with a separate subspace (maybe “_meta”?)
for the revision trees, with keys and values that look like
> >>
> >> (“_meta”, DocID, NotDeleted, RevPosition, RevHash) = (VersionstampForCurrentRev,
[ParentRev, GrandparentRev, …])
> >>
> >> Some notes:
> >>
> >> - including IsDeleted ensures that we can efficiently accept the case where
we upload a new document with the same ID where all previous edit branches have been deleted;
i.e. we can construct a key selector which automatically tells us there are no deleted=false
edit branches
> >> - access to VersionstampForCurrentRev ensures we can clear the old entry in
the by_seq space during the update
> >> - i need to remind myself how we handle an edit attempt which supplies a _rev
representing a deleted leaf. Do we fail that as a conflict? That would be the natural thing
to do here, otherwise we’re forced to check both deleted=false and deleted=true keys
> >> - the keys can be made to naturally sort so that the winning revision sorts
last, but I don’t believe that’s a requirement here like it is for the actual document
data space
> >>
> >> Cheers, Adam
> >>
> >>> On Feb 8, 2019, at 8:59 AM, Paul Davis <paul.joseph.davis@gmail.com>
wrote:
> >>>
> >>>> I’m relatively happy with the revision history data model at this
point.
> >>>
> >>> I forgot to make a note, but which of the various models are you
> >>> referring to by "revision history data model". There's been so many
> >>> without firm names that my brain is having a hard time parsing that
> >>> one.
> >>>
> >>> On Thu, Feb 7, 2019 at 9:35 PM Adam Kocoloski <kocolosk@apache.org>
wrote:
> >>>>
> >>>> Bob, Garren, Jan - heard you loud and clear, K.I.S.S. I do think it’s
a bit “simplistic" to exclusively choose simplicity over performance and storage density.
We’re (re)building a database here, one that has some users with pretty demanding performance
and scalability requirements. And yes, we should certainly be testing and measuring. Kyle
and team are setting up infrastructure in IBM land to help with that now, but I also believe
we can design the data model and architecture with a basic performance model of FoundationDB
in mind:
> >>>>
> >>>> - reads cost 1ms
> >>>> - short range reads are the same cost as a single lookup
> >>>> - reads of independent parts of the keyspace can be parallelized for
cheap
> >>>> - writes are zero-cost until commit time
> >>>>
> >>>> We ought to be able to use these assumptions to drive some decisions
about data models ahead of any end-to-end performance test.
> >>>>
> >>>> If there are specific elements of the edit conflicts management where
you think greater simplicity is warranted, let’s get those called out. Ilya noted (correctly,
in my opinion) that the term sharing stuff is one of those items. It’s relatively complex,
potentially a performance hit, and only saves on storage density in the corner case of lots
of edit conflicts. That’s a good one to drop.
> >>>>
> >>>> I’m relatively happy with the revision history data model at this
point. Hopefully folks find it easy to grok, and it’s efficient for both reads and writes.
It costs some extra storage for conflict revisions compared to the current tree representation
(up to 16K per edit branch, with default _revs_limit) but knowing what we know about the performance
death spiral for wide revision trees today I’ll happily make a storage vs. performance tradeoff
here :)
> >>>>
> >>>> Setting the shared term approach aside, I’ve still been mulling over
the key structure for the actual document data:
> >>>>
> >>>> -  I thought about trying to construct a special _conflicts subspace,
but I don’t like that approach because the choice of a “winning" revision can flip back
and forth very quickly with concurrent writers to different edit branches. I think we really
want to have a way for revisions to naturally sort themselves so the winner is the first or
last revision in a list.
> >>>>
> >>>> - Assuming we’re using key paths of the form (docid, revision-ish,
path, to, field), the goal here is to find an efficient way to get the last key with prefix
“docid” (assuming winner sorts last), and then all the keys that share the same (docid,
revision-ish) prefix as that one. I see two possible approaches so far, neither perfect:
> >>>>
> >>>> Option 1: Execute a get_key() operation with a key selector that asks
for the last key less than “docid\xFF” (again assuming winner sorts last), and then do
a get_range_startswith() request setting the streaming mode to “want_all” and the prefix
to the docid plus whatever revision-ish we found from the get_key() request. This is two roundtrips
instead of one, but it always retrieves exactly the right set of keys, and the second step
is executed as fast as possible.
> >>>>
> >>>> Option 2: Jump straight to get_range_startswith() request using only
“docid” as the prefix, then cancel the iteration once we reach a revision not equal to
the first one we see. We might transfer too much data, or we might end up doing multiple roundtrips
if the default “iterator” streaming mode sends too little data to start (I haven’t checked
what the default iteration block is there), but in the typical case of zero edit conflicts
we have a good chance of retrieving the full document in one roundtrip.
> >>>>
> >>>> I don’t have a good sense of which option wins out here from a performance
perspective, but they’re both operating on the same data model so easy enough to test the
alternatives. The important bit is getting the revision-ish things to sort correctly. I think
we can do that by generating something like
> >>>>
> >>>> revision-ish = NotDeleted/1bit : RevPos : RevHash
> >>>>
> >>>> with some suitable order-preserving encoding on the RevPos integer.
> >>>>
> >>>> Apologies for the long email. Happy for any comments, either here or
over on IRC. Cheers,
> >>>>
> >>>> Adam
> >>>>
> >>>>> On Feb 7, 2019, at 4:52 PM, Robert Newson <rnewson@apache.org>
wrote:
> >>>>>
> >>>>> I think we should choose simple. We can then see if performance
is too low or storage overhead too high and then see what we can do about it.
> >>>>>
> >>>>> B.
> >>>>>
> >>>>> --
> >>>>> Robert Samuel Newson
> >>>>> rnewson@apache.org
> >>>>>
> >>>>> On Thu, 7 Feb 2019, at 20:36, Ilya Khlopotov wrote:
> >>>>>> We cannot do simple thing if we want to support sharing of JSON
terms. I
> >>>>>> think if we want the simplest path we should move sharing out
of the
> >>>>>> scope. The problem with sharing is we need to know the location
of
> >>>>>> shared terms when we do write. This means that we have to read
full
> >>>>>> document on every write. There might be tricks to replace full
document
> >>>>>> read with some sort of hierarchical signature or sketch of a
document.
> >>>>>> However these tricks do not fall into simplest solution category.
We
> >>>>>> need to choose the design goals:
> >>>>>> - simple
> >>>>>> - performance
> >>>>>> - reduced storage overhead
> >>>>>>
> >>>>>> best regards,
> >>>>>> iilyak
> >>>>>>
> >>>>>> On 2019/02/07 12:45:34, Garren Smith <garren@apache.org>
wrote:
> >>>>>>> I’m also in favor of keeping it really simple and then
testing and
> >>>>>>> measuring it.
> >>>>>>>
> >>>>>>> What is the best way to measure that we have something that
works? I’m not
> >>>>>>> sure just relying on our current tests will prove that?
Should we define
> >>>>>>> and build some more complex situations e.g docs with lots
of conflicts or
> >>>>>>> docs with wide revisions and make sure we can solve for
those?
> >>>>>>>
> >>>>>>> On Thu, Feb 7, 2019 at 12:33 PM Jan Lehnardt <jan@apache.org>
wrote:
> >>>>>>>
> >>>>>>>> I’m also very much in favour with starting with the
simplest thing that
> >>>>>>>> can possibly work and doesn’t go against the advertised
best practices of
> >>>>>>>> FoundationDB. Let’s get that going and get a feel
for how it all works
> >>>>>>>> together, before trying to optimise things we can’t
measure yet.
> >>>>>>>>
> >>>>>>>> Best
> >>>>>>>> Jan
> >>>>>>>> —
> >>>>>>>>
> >>>>>>>>> On 6. Feb 2019, at 16:58, Robert Samuel Newson <rnewson@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> With the Redwood storage engine under development
and with prefix
> >>>>>>>> elision part of its design, I don’t think we should
get too hung up on
> >>>>>>>> adding complications and indirections in the key space
just yet. We haven’t
> >>>>>>>> written a line of code or run any tests, this is premature
optimisation.
> >>>>>>>>>
> >>>>>>>>> I’d like to focus on the simplest solution that
yields all required
> >>>>>>>> properties. We can embellish later (if warranted).
> >>>>>>>>>
> >>>>>>>>> I am intrigued by all the ideas that might allow
us cheaper inserts and
> >>>>>>>> updates than the current code where there are multiple
edit branches in the
> >>>>>>>> stored document.
> >>>>>>>>>
> >>>>>>>>> B.
> >>>>>>>>>
> >>>>>>>>>> On 6 Feb 2019, at 02:18, Ilya Khlopotov <iilyak@apache.org>
wrote:
> >>>>>>>>>>
> >>>>>>>>>> While reading Adam's proposal I came to realize
that: we don't have to
> >>>>>>>> calculate winning revision at read time.
> >>>>>>>>>> Since FDB's transactions are atomic we can calculate
it when we write.
> >>>>>>>> This means we can just write latest values into separate
range. This makes
> >>>>>>>> lookup of latest version fast.
> >>>>>>>>>> Another realization is if we want to share values
for some json paths
> >>>>>>>> we would have to introduce a level of indirection.
> >>>>>>>>>> Bellow is the data model inspired by Adam's
idea to share json_paths.
> >>>>>>>> In this model the json_path is stored in the revision
where it was first
> >>>>>>>> added (we call that revision an owner of a json_path).
The values for
> >>>>>>>> json_path key can be scalar values, parts of scalar
values or pointers to
> >>>>>>>> owner location.
> >>>>>>>>>> The below snippets are sketches of transactions.
> >>>>>>>>>> The transactions will include updates to other
keys as needed
> >>>>>>>> (`external_size`, `by_seq` and so on).  The revision
tree management is not
> >>>>>>>> covered yet.
> >>>>>>>>>> The `rev -> vsn` indirection is not strictly
required. It is added
> >>>>>>>> because it saves some space since `rev` is a long string
and `vsn` is FDB
> >>>>>>>> versionstamp of fixed size.
> >>>>>>>>>>
> >>>>>>>>>> - `{NS} / {docid} / _by_rev / {rev} = vsn`
> >>>>>>>>>> - `{NS} / {docid} / _used_by / {json_path} /
{another_vsn} = NIL`
> >>>>>>>>>> - `{NS} / {docid} / _data / {json_path} = latest_value
| part`
> >>>>>>>>>> - `{NS} / {docid} / {vsn} / _data / {json_path}
= value | part |
> >>>>>>>> {another_vsn}`
> >>>>>>>>>>
> >>>>>>>>>> ```
> >>>>>>>>>> write(txn, doc_id, prev_rev, json):
> >>>>>>>>>> txn.add_write_conflict_key("{NS} / {doc_id}
/ _rev")
> >>>>>>>>>> rev = generate_new_rev()
> >>>>>>>>>> txn["{NS} / {docid} / _by_rev / {rev}"] = vsn
> >>>>>>>>>> for every json_path in flattened json
> >>>>>>>>>> - {NS} / {docid} / _used_by / {json_path} /
{another_vsn} = NIL
> >>>>>>>>>> if rev is HEAD:
> >>>>>>>>>>   # this range contains values for all json
paths for the latest
> >>>>>>>> revision (read optimization)
> >>>>>>>>>>   - {NS} / {docid} / _data / {json_path} = latest_value
| part
> >>>>>>>>>> - {NS} / {docid} / {vsn} / _data / {json_path}
= value | part |
> >>>>>>>> {another_vsn}
> >>>>>>>>>> txn["{NS} / {doc_id} / _rev"] = rev
> >>>>>>>>>>
> >>>>>>>>>> get_current(txn, doc_id):
> >>>>>>>>>> # there is no sharing of json_paths in this
range (read optimization)
> >>>>>>>>>> txn.get_range("{NS} / {docid} / _data / 0x00",
"{NS} / {docid} / _data
> >>>>>>>> / 0xFF" )
> >>>>>>>>>>
> >>>>>>>>>> get_revision(txn, doc_id, rev):
> >>>>>>>>>> vsn = txn["{NS} / {docid} / _by_rev / {rev}"]
> >>>>>>>>>> json_paths = txn.get_range("{NS} / {vsn} / {docid}
/ _data / 0x00",
> >>>>>>>> "{NS} / {vsn} / {docid} / _data / 0xFF" )
> >>>>>>>>>> for every json_path in json_paths:
> >>>>>>>>>> if value has type vsn:
> >>>>>>>>>>   another_vsn = value
> >>>>>>>>>>      value = txn["{NS} / {docid} / {another_vsn}
/ _data /
> >>>>>>>> {json_path}"]
> >>>>>>>>>> result[json_path] = value
> >>>>>>>>>>
> >>>>>>>>>> delete_revision(txn, doc_id, rev):
> >>>>>>>>>> vsn = txn["{NS} / {docid} / _by_rev / {rev}"]
> >>>>>>>>>> json_paths = txn.get_range("{NS} / {vsn} / {docid}
/ _data / 0x00",
> >>>>>>>> "{NS} / {vsn} / {docid} / _data / 0xFF" )
> >>>>>>>>>> for every json_path in json_paths:
> >>>>>>>>>> if value has type vsn:
> >>>>>>>>>>  # remove reference to deleted revision from
the owner
> >>>>>>>>>>   del txn[{NS} / {docid} / _used_by / {json_path}
/ {vsn}]
> >>>>>>>>>> # check if deleted revision of json_path is
not used by anything else
> >>>>>>>>>> if txn.get_range("{NS} / {docid} / _used_by
/ {json_path} / {vsn}",
> >>>>>>>> limit=1) == []:
> >>>>>>>>>>   del txn["{NS} / {docid} / {vsn} / _data /
{json_path}"]
> >>>>>>>>>> if vsn is HEAD:
> >>>>>>>>>>   copy range for winning revision into "{NS}
/ {docid} / _data /
> >>>>>>>> {json_path}"
> >>>>>>>>>> ```
> >>>>>>>>>>
> >>>>>>>>>> best regards,
> >>>>>>>>>> iilyak
> >>>>>>>>>>
> >>>>>>>>>> On 2019/02/04 23:22:09, Adam Kocoloski <kocolosk@apache.org>
wrote:
> >>>>>>>>>>> I think it’s fine to start a focused discussion
here as it might help
> >>>>>>>> inform some of the broader debate over in that thread.
> >>>>>>>>>>>
> >>>>>>>>>>> As a reminder, today CouchDB writes the
entire body of each document
> >>>>>>>> revision on disk as a separate blob. Edit conflicts
that have common fields
> >>>>>>>> between them do not share any storage on disk. The revision
tree is encoded
> >>>>>>>> into a compact format and a copy of it is stored directly
in both the by_id
> >>>>>>>> tree and the by_seq tree. Each leaf entry in the revision
tree contain a
> >>>>>>>> pointer to the position of the associated doc revision
on disk.
> >>>>>>>>>>>
> >>>>>>>>>>> As a further reminder, CouchDB 2.x clusters
can generate edit conflict
> >>>>>>>> revisions just from multiple clients concurrently updating
the same
> >>>>>>>> document in a single cluster. This won’t happen when
FoundationDB is
> >>>>>>>> running under the hood, but users who deploy multiple
CouchDB or PouchDB
> >>>>>>>> servers and replicate between them can of course still
produce conflicts
> >>>>>>>> just like they could in CouchDB 1.x, so we need a solution.
> >>>>>>>>>>>
> >>>>>>>>>>> Let’s consider the two sub-topics separately:
1) storage of edit
> >>>>>>>> conflict bodies and 2) revision trees
> >>>>>>>>>>>
> >>>>>>>>>>> ## Edit Conflict Storage
> >>>>>>>>>>>
> >>>>>>>>>>> The simplest possible solution would be
to store each document
> >>>>>>>> revision separately, like we do today. We could store
document bodies with
> >>>>>>>> (“docid”, “revid”) as the key prefix, and each
transaction could clear the
> >>>>>>>> key range associated with the base revision against
which the edit is being
> >>>>>>>> attempted. This would work, but I think we can try to
be a bit more clever
> >>>>>>>> and save on storage space given that we’re splitting
JSON documents into
> >>>>>>>> multiple KV pairs.
> >>>>>>>>>>>
> >>>>>>>>>>> One thought I’d had is to introduce a
special enum Value which
> >>>>>>>> indicates that the subtree “beneath” the given Key
is in conflict. For
> >>>>>>>> example, consider the documents
> >>>>>>>>>>>
> >>>>>>>>>>> {
> >>>>>>>>>>> “_id”: “foo”,
> >>>>>>>>>>> “_rev”: “1-abc”,
> >>>>>>>>>>> “owner”: “alice”,
> >>>>>>>>>>> “active”: true
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> and
> >>>>>>>>>>>
> >>>>>>>>>>> {
> >>>>>>>>>>> “_id”: “foo”,
> >>>>>>>>>>> “_rev”: “1-def”,
> >>>>>>>>>>> “owner”: “bob”,
> >>>>>>>>>>> “active”: true
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> We could represent these using the following
set of KVs:
> >>>>>>>>>>>
> >>>>>>>>>>> (“foo”, “active”) = true
> >>>>>>>>>>> (“foo”, “owner”) = kCONFLICT
> >>>>>>>>>>> (“foo”, “owner”, “1-abc”) =
“alice”
> >>>>>>>>>>> (“foo”, “owner”, “1-def”) =
“bob”
> >>>>>>>>>>>
> >>>>>>>>>>> This approach also extends to conflicts
where the two versions have
> >>>>>>>> different data types. Consider a more complicated example
where bob dropped
> >>>>>>>> the “active” field and changed the “owner” field
to an object:
> >>>>>>>>>>>
> >>>>>>>>>>> {
> >>>>>>>>>>> “_id”: “foo”,
> >>>>>>>>>>> “_rev”: “1-def”,
> >>>>>>>>>>> “owner”: {
> >>>>>>>>>>> “name”: “bob”,
> >>>>>>>>>>> “email”: “bob@example.com"
> >>>>>>>>>>> }
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> Now the set of KVs for “foo” looks like
this (note that a missing
> >>>>>>>> field needs to be handled explicitly):
> >>>>>>>>>>>
> >>>>>>>>>>> (“foo”, “active”) = kCONFLICT
> >>>>>>>>>>> (“foo”, “active”, “1-abc”) =
true
> >>>>>>>>>>> (“foo”, “active”, “1-def”) =
kMISSING
> >>>>>>>>>>> (“foo”, “owner”) = kCONFLICT
> >>>>>>>>>>> (“foo”, “owner”, “1-abc”) =
“alice”
> >>>>>>>>>>> (“foo”, “owner”, “1-def”, “name”)
= “bob”
> >>>>>>>>>>> (“foo”, “owner”, “1-def”, “email”)
= “bob@example.com”
> >>>>>>>>>>>
> >>>>>>>>>>> I like this approach for the common case
where documents share most of
> >>>>>>>> their data in common but have a conflict in a very specific
field or set of
> >>>>>>>> fields.
> >>>>>>>>>>>
> >>>>>>>>>>> I’ve encountered one important downside,
though: an edit that
> >>>>>>>> replicates in and conflicts with the entire document
can cause a bit of a
> >>>>>>>> data explosion. Consider a case where I have 10 conflicting
versions of a
> >>>>>>>> 100KB document, but the conflicts are all related to
a single scalar value.
> >>>>>>>> Now I replicate in an empty document, and suddenly I
have a kCONFLICT at
> >>>>>>>> the root. In this model I now need to list out every
path of every one of
> >>>>>>>> the 10 existing revisions and I end up with a 1MB update.
Yuck. That’s
> >>>>>>>> technically no worse in the end state than the “zero
sharing” case above,
> >>>>>>>> but one could easily imagine overrunning the transaction
size limit this
> >>>>>>>> way.
> >>>>>>>>>>>
> >>>>>>>>>>> I suspect there’s a smart path out of
this. Maybe the system detects a
> >>>>>>>> “default” value for each field and uses that instead
of writing out the
> >>>>>>>> value for every revision in a conflicted subtree. Worth
some discussion.
> >>>>>>>>>>>
> >>>>>>>>>>> ## Revision Trees
> >>>>>>>>>>>
> >>>>>>>>>>> In CouchDB we currently represent revisions
as a hash history tree;
> >>>>>>>> each revision identifier is derived from the content
of the revision
> >>>>>>>> including the revision identifier of its parent. Individual
edit branches
> >>>>>>>> are bounded in *length* (I believe the default is 1000
entries), but the
> >>>>>>>> number of edit branches is technically unbounded.
> >>>>>>>>>>>
> >>>>>>>>>>> The size limits in FoundationDB preclude
us from storing the entire
> >>>>>>>> key tree as a single value; in pathological situations
the tree could
> >>>>>>>> exceed 100KB. Rather, I think it would make sense to
store each edit
> >>>>>>>> *branch* as a separate KV. We stem the branch long before
it hits the value
> >>>>>>>> size limit, and in the happy case of no edit conflicts
this means we store
> >>>>>>>> the edit history metadata in a single KV. It also means
that we can apply
> >>>>>>>> an interactive edit without retrieving the entire conflicted
revision tree;
> >>>>>>>> we need only retrieve and modify the single branch against
which the edit
> >>>>>>>> is being applied. The downside is that we duplicate
historical revision
> >>>>>>>> identifiers shared by multiple edit branches, but I
think this is a
> >>>>>>>> worthwhile tradeoff.
> >>>>>>>>>>>
> >>>>>>>>>>> I would furthermore try to structure the
keys so that it is possible
> >>>>>>>> to retrieve the “winning” revision in a single limit=1
range query. Ideally
> >>>>>>>> I’d like to proide the following properties:
> >>>>>>>>>>>
> >>>>>>>>>>> 1) a document read does not need to retrieve
the revision tree at all,
> >>>>>>>> just the winning revision identifier (which would be
stored with the rest
> >>>>>>>> of the doc)
> >>>>>>>>>>> 2) a document update only needs to read
the edit branch of the
> >>>>>>>> revision tree against which the update is being applied,
and it can read
> >>>>>>>> that branch immediately knowing only the content of
the edit that is being
> >>>>>>>> attempted (i.e., it does not need to read the current
version of the
> >>>>>>>> document itself).
> >>>>>>>>>>>
> >>>>>>>>>>> So, I’d propose a separate subspace (maybe
“_meta”?) for the revision
> >>>>>>>> trees, with keys and values that look like
> >>>>>>>>>>>
> >>>>>>>>>>> (“_meta”, DocID, IsDeleted, RevPosition,
RevHash) = [ParentRev,
> >>>>>>>> GrandparentRev, …]
> >>>>>>>>>>>
> >>>>>>>>>>> The inclusion of IsDeleted, RevPosition
and RevHash in the key should
> >>>>>>>> be sufficient (with the right encoding) to create a
range query that
> >>>>>>>> automatically selects the “winner” according to
CouchDB’s arcane rules,
> >>>>>>>> which are something like
> >>>>>>>>>>>
> >>>>>>>>>>> 1) deleted=false beats deleted=true
> >>>>>>>>>>> 2) longer paths (i.e. higher RevPosition)
beat shorter ones
> >>>>>>>>>>> 3) RevHashes with larger binary values beat
ones with smaller values
> >>>>>>>>>>>
> >>>>>>>>>>> ===========
> >>>>>>>>>>>
> >>>>>>>>>>> OK, that’s all on this topic from me for
now. I think this is a
> >>>>>>>> particularly exciting area where we start to see the
dividends of splitting
> >>>>>>>> up data into multiple KV pairs in FoundationDB :) Cheers,
> >>>>>>>>>>>
> >>>>>>>>>>> Adam
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> On Feb 4, 2019, at 2:41 PM, Robert Newson
<rnewson@apache.org> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> This one is quite tightly coupled to
the other thread on data model,
> >>>>>>>> should we start much conversation here before that one
gets closer to a
> >>>>>>>> solution?
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> Robert Samuel Newson
> >>>>>>>>>>>> rnewson@apache.org
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, 4 Feb 2019, at 19:25, Ilya Khlopotov
wrote:
> >>>>>>>>>>>>> This is a beginning of a discussion
thread about storage of edit
> >>>>>>>>>>>>> conflicts and everything which relates
to revisions.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Professional Support for Apache CouchDB:
> >>>>>>>> https://neighbourhood.ie/couchdb-support/
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>
> >>
>

Mime
View raw message