incubator-wave-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex North <a...@alexn.id.au>
Subject Re: JSON editing and move conflicts
Date Wed, 03 Dec 2014 09:46:06 GMT
<blast from="the-past">

Joseph's earlier comments suggest you already realise this, but Wave
implements map structures as list-of-pairs, so conflicting operations end
up both surviving at the protocol level, with a consistent order reflecting
their server order, so that conflict resolution can be left to the
application. It's a bit obscure because the list-of-pairs is embedded in
the wave doc structure, which isn't immediately obvious as essentially
nested lists. This storage/transport representation doesn't constrain the
conceptual application models built on top of it, where the data may then
be converted to a HashMap, JS object etc via application logic.
Occasionally we used element attributes as a model if last-one-wins was
right for the application.

We thought lots about this for different cases and this reflects a general
rule we discovered: you can't design protocol-level conflict resolution in
a re-useable way to suit very many different applications. Something that
works well for one application will confound another. Adding complexity at
this level pretty much always turned out poorly, and where we needed a tie
break it went in favour of document editing (e.g. cursors after concurrent
insertion). Hence both lack of a map primitive and the lack of a move
operation in favour of delete/insert.

P.S. Joseph, I hear your in Sydney. Look forward to seeing you.

</blast>

On Thu Oct 23 2014 at 6:08:38 AM Joseph Gentle <josephg@gmail.com> wrote:

> On Wed, Oct 22, 2014 at 2:50 AM, Patrick Coleman <patcoleman@google.com>
> wrote:
> >>
> >> - I'm not sure if I want to make this OT type invertable. If we don't,
> >
> > What are the costs of making it invertable? i.e. are there particular
> > reasons not to?
> > I guess for proper invertability, you'd need to copy all the data being
> > moved?
>
> I honestly don't know what the complexity cost is. And I've wavered
> back and forth on whether invertibility is worth it for the past few
> years. The way I'm looking at designing operations at the moment is
> something like this:
>
> Given a doc: {x:5, y:[1,2,3]}:
> Pickup phase: go in key x, delete, skip to key y, go in key 1, pick up
> (id=0)
> Edit phase: go in key y.0. Add 10
> Drop phase: go to key x, insert "hi", go in key z, drop id=0
>
> -> The document will be {x:"hi", y:[11,3], z:2}
>
> If this works out, inverting the operation *should be* really easy -
> we just swap the pickup phase & drop phase, then invert everything in
> the edit phase. We *don't* need a copy of moved data, we just need to
> swap the source and destination. I'm thinking that if you explicitly
> want to overwrite something, you'll need to delete it in the pickup
> phase before you can move something over the top of it. Then the
> semantics of conflicts-on-insert will match the semantics of
> conflict-on-move. But that also means that only deletes would need to
> specify what was deleted to make invertibility work.
>
> I don't know if this will actually work out in practice - I'm writing
> the transform function now and I may end up changing the data model
> somewhat.
>
> I also want a couple more properties:
>
> - I want to be able to embed subtypes for things like rich text - and
> those subtypes aren't necessarily invertible themselves.
> - I want to make a TP2 version of this JSON type, so we can make it
> into a p2p federated wave-like structure at some point. That type will
> not be invertible because of the requirements of tp2 semantics.
>
> So, I'm not sure.
>
> >> But we could implement this much more cleanly if move operations
> >> specify a fallback location for conflicting data to get moved into. If
> >> that fallback location was a list, we can guarantee we won't have any
> >> cascaded conflicts. To resolve the conflict, the user can just move or
> >> delete the conflicted object like normal.
> >>
> >
> > Out of interest, is this something that conflicting 'set' operations
> would
> > want too?
> > e.g. (Set 'c' {v: 1}) & (Set 'c' {v: 2}) seems to have similar loss
> > semantics,
> > should _recovery then be [{set: 'c', data: {v: 1}}]
>
> Yep absolutely.
>
> >> Some observations:
> >> >> - This will almost never actually happen in real life.
> >> >> - Even if we turn the operation will transform into a 'replace', we
> >> >> could add a flag to the operation to mark that it was unintentionally
> >> >> overwritten, and then the process which is making the corresponding
> >> >> changes on disk could see that and do some special
> >> >> application-specific behaviour instead of actually deleting the file.
> >> >
> >> >
> >> > Indeed - your structure could instead be:
> >> > { "name": "librope",
> >> >   "package": {
> >> >     "src": {
> >> >       "rope.c": "somelonguniqueid",
> >> >       "rope.h": "anotherfileid",
> >> >   }},
> >> >   data: {
> >> >     "somelonguniqueid": { "text": "#include <rope.h>\n ..." },
> >> >     "anotherfileid": { "text": "// This is a cool header"  },
> >> >   }}
> >> >
> >> > Then deletion is some separate offline index/cleanup, and resolving
> >> > conflicts is simpler.
> >> > (although 'move' is less useful here as there are no child
> properties).
> >>
> >> Yeah - thats definitely usable. It kind of moves the mess around your
> >> plate a little. Its very similar to the idea of just not allowing
> >> move-to-object operations at all, which would force every
> >> dictionary-like structure to be managed using lists with manual
> >> (custom!) dedup in the case of conflicts. Either way, application
> >> writers would need to do some manual garbage collection.
> >
> >
> > Yep, I was assuming everyone wants manual garbage collection anyway :)
> > I realized though that moving folders messes this up a bit;
> > either the folders are also files (possible, but then every N-deep folder
> > traversal does N id lookups)
> > or they are objects (e.g. "src": {"js": { "file1": id1, "file2": id2,
> > ...}}) in which case you're back to the original problem.
>
> Yeah exactly. I think the merge conflict semantics I suggested above
> are a decent fit for this problem, and will make things a lot easier.
>
> >> Does move distinguish between PUT vs. PATCH?
> >> > I would assume PUT (x: {a: 1})  -> z and PUT (y: {b: 2}) -> z would
> end
> >> up
> >> > in z: {b:2}
> >> > but if the second operation received was a PATCH operation, they
> would be
> >> > merged rather than replaced.
> >> > For file system moves they'd probably be PUT, but their might be
> >> instances
> >> > where a PATCH is useful.
> >>
> >> I was just thinking of move as a PUT operation. When is PATCH useful?
> >> In what use cases do you actually want two object structures to get
> >> merged?
> >>
> >
> > Only thing I could think of was batch file moves - i.e. moving a set of
> > string-id'd files from one directory to another,
> > you'd want to PATCH move {'f1.js': {...}, 'f2.js': {...}} from src ->
> dest
> > but then it's a bit weird, as you'd want PATCH on the top level and PUT
> > underneath, and it'd probably be better off as a list of PUT moves.
> > Otherwise, I couldn't think of one, unless a client wanted to get really
> > abstract and e.g. batch a 'local edits' object which then get 'applied'
> > by a PATCH move onto the persisted object, but that also sounds like
> > something that could be done better in other ways.
>
> Yeah.... I'm just going to leave this out for now. If there is a
> compelling use case, A special OT merge operation would be strictly
> better than manually merging all the keys, but if we can't think of
> any actual use for it, I'll save myself the trouble.
>
> -J
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message