incubator-wave-dev mailing list archives

From John Blossom <jblos...@gmail.com>
Subject Re: Future of Apache wave [Was: Re: Advantages of P2P messaging?]
Date Fri, 14 Jun 2013 14:57:39 GMT
Michael, Joseph,

I am reading this part of the thread with great interest. It sounds as if
the OT paradigms that you and Joseph are suggesting might support different
types of formats for different messages/blips within a common
document/wavelet. That would seem to be an important objective,
potentially, as it will make it easier for conversations to turn into rich
collaborative environments. You can keep a typical exchange in Wave very
texty and compact, and then punch it out message by message at certain
points in the conversation to create deeper engagement. At the same time, I am
wondering out loud about how to provide data models that can enable
multiple apps or applets to access the same documents/messages to provide a
variety of services for those messages. So I am wondering how these OT
models might manage their data models such that a wide variety of apps can
look at a message's construct and not be "locked out" due to data
formatting too specific to an app or applet/gadget.

Thanks,
John

On Fri, Jun 14, 2013 at 4:39 AM, Michael MacFadden <
michael.macfadden@gmail.com> wrote:

> Bruno / Sam,
>
> This is the tricky part.  You can abstract "some" parts of the operation
> but not all.  The whole point of the OT Stack is to adjust the parameters
> of your operations so that they have the same meaning in multiple contexts.
>  We can't abstract away all of those parameters because the transformation
> functions must define rules to transform those parameters.
>
> That said, in an object oriented language you typically have two types of
> entities.  Object Types and Primitive Types.  You may need operations that
> handle both.  For example, the things you may do to an int, may be
> different than what you would do with a "Person" object.  The things you
> do with an Array may be different than the things you do with a Map.
>
> Correctly defining this set of operations is tricky and, as I said, part
> of ongoing research.  But the approach is sound from an idealistic
> standpoint.  Whether it is practical or not is another story.
>
> ~Michael
>
> On 6/13/13 9:15 PM, "Bruno Gonzalez (aka stenyak)" <stenyak@gmail.com>
> wrote:
>
> >I assume the "path" or "index" would be abstracted too. This way, OT can
> >also handle the (x,y) position of a pixel in an image, or any other kind of
> >position or range in which the operation must be applied.
> >
> >
> >On Thu, Jun 13, 2013 at 10:06 PM, Sam Nelson <sohra@orcon.net.nz> wrote:
> >
> >> Hi Michael,
> >>
> >> I'm trying to wrap my head around this too.
> >> Say you have some JSON object:
> >> {
> >>   "i" : 5,
> >>   "s" : "string",
> >>   "c" : { "i" : 2 },
> >>   "a" : [ { "i" : 3 } ]
> >> }
> >>
> >> What would the parameters be to delete "s" since a path is really required,
> >> isn't it, rather than an index? (i.e. parameters are specific to the type
> >> they operate on)  And further, what would a delete operation do in this
> >> case?  Remove the "s" member of the object, or just set its value to null?
> >> That decision could be application implementation specific, sure, but if
> >> the application needed both concepts, how can you now define two abstract
> >> delete operations, in order for the application to implement them both for
> >> each case?
> >>
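Sam's two readings of "delete" can be kept apart by giving them separate
operation names that both take a path. The following is only a rough sketch of
that idea in Python. The names remove_member and set_value are hypothetical and
are not taken from ShareJS or Wave, and no transformation logic is shown.

```python
import copy

def _parent_of(doc, path):
    """Walk every path element except the last and return the container found."""
    node = doc
    for key in path[:-1]:
        node = node[key]
    return node

def remove_member(doc, path):
    """Hypothetical op: drop the member (or list element) at `path` entirely."""
    new_doc = copy.deepcopy(doc)
    del _parent_of(new_doc, path)[path[-1]]
    return new_doc

def set_value(doc, path, value):
    """Hypothetical op: keep the member at `path` but overwrite its value."""
    new_doc = copy.deepcopy(doc)
    _parent_of(new_doc, path)[path[-1]] = value
    return new_doc

doc = {"i": 5, "s": "string", "c": {"i": 2}, "a": [{"i": 3}]}
without_s = remove_member(doc, ["s"])      # "s" is gone
nulled_s = set_value(doc, ["s"], None)     # "s" is still present, set to None
nested = remove_member(doc, ["a", 0, "i"]) # paths may mix keys and indexes
```

With two distinct operations the application never has to guess which meaning
of "delete" was intended.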
> >> -Sam
> >>
> >>
> >>
> >>
> >> On 14/06/2013 07:45, Michael MacFadden wrote:
> >>
> >>> Joseph,
> >>>
> >>> We are almost in sync now.  Let's go one step further.  Let's say you were
> >>> designing an application to be a rich text editor.  Forget OT, you are just
> >>> making an editor.  I assume your editor has to have some sort of model,
> >>> right?  Let's temporarily forget the persistence format.  You may save the
> >>> rich text to xml, or rtf, or whatever, but I am not worried about that.  I
> >>> am asking what the in-memory model is that your editor uses to interact
> >>> with the document.  Build that.  Build it any way you like.
> >>>
> >>> Ok so now you have a rich text object model.  Your editor is going to
> >>> interact with that through some sort of object model API.  When the user
> >>> selects some text and presses the bold button, the editor makes some API
> >>> call to the model and says, make this part bold.  For the sake of
> >>> conversation, I don't care how that internally happens in the object data
> >>> model.
> >>>
> >>> OK.  So now if we have a sufficiently powerful OT operation set that can
> >>> describe manipulating objects, we can manipulate the object model with OT.
> >>> Really, what OT services are is robust message buses that describe how
> >>> one user is changing the objects to another user, accounting for
> >>> context transformations along the way.  So if you can build an abstract OT
> >>> operation set that lets you mess with objects and object structures, then
> >>> you have a shot at adapting that operation set to a whole slew of
> >>> applications.
> >>>
> >>> This is actually an ongoing area of research, which I presented a paper on
> >>> to the collaborative editing workshop at the ACM CSCW conference last
> >>> year.
> >>>
> >>> ~Michael
> >>>
> >>> On 6/13/13 8:34 PM, "Joseph Gentle" <josephg@gmail.com> wrote:
> >>>
> >>>  Interesting...
> >>>>
> >>>> The abstraction I use is to have a bunch of data types. Each data type
> >>>> defines what documents look like, what operations look like and they
> >>>> define a set of OT functions (transform, compose, apply, etc). Eg,
> >>>> Text documents are strings and their operations are lists of {skip:5},
> >>>> {insert:'hi'}, {delete:10}, etc. JSON documents are JSON and their
> >>>> operations are lists of path+what to do there. Eg, [{path: ['hi'],
> >>>> delete list element 5}, ...]
> >>>>
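The text operation format Joseph describes (lists of skip, insert and delete
components) can be illustrated with a small apply function. This is a sketch
only, not the ShareJS implementation. The function name apply_text_op and the
rule that any text after the last component is kept unchanged are assumptions
made for the example.

```python
def apply_text_op(doc, op):
    """Apply a list of {skip: n} / {insert: s} / {delete: n} components to a string."""
    out = []
    pos = 0  # read position in the original document
    for component in op:
        if "skip" in component:
            end = pos + component["skip"]
            if end > len(doc):
                raise ValueError("skip runs past the end of the document")
            out.append(doc[pos:end])  # copy skipped text through unchanged
            pos = end
        elif "insert" in component:
            out.append(component["insert"])  # new text, read position unchanged
        elif "delete" in component:
            end = pos + component["delete"]
            if end > len(doc):
                raise ValueError("delete runs past the end of the document")
            pos = end  # drop the deleted text by not copying it
        else:
            raise ValueError("unknown component: %r" % (component,))
    out.append(doc[pos:])  # assumption: the remaining text is kept as-is
    return "".join(out)

result = apply_text_op(
    "hello world",
    [{"skip": 5}, {"insert": ","}, {"delete": 1}, {"insert": " "}],
)
```

A full OT type would pair this with transform and compose functions over the
same component list.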
> >>>> It sounds like you're saying we should abstract over the ideas of
> >>>> ot-for-lists, ot-for-sets and so on. Is that right?
> >>>>
> >>>> ... But rich text isn't quite a list or a set. You can make annotation
> >>>> markers or something, but then they take up space. Maybe it's possible
> >>>> to ignore the final document space that an annotation takes up for the
> >>>> purpose of transformation?
> >>>>
> >>>> Another architecture I've thought about using is making all documents
> >>>> use the JSON OT code. Specialized types like rich text can exist as
> >>>> leaves in the JSON structure - and let you embed a rich text operation
> >>>> inside a JSON operation.
> >>>>
> >>>> -J
> >>>>
> >>>>
> >>>> On Thu, Jun 13, 2013 at 12:05 PM, Michael MacFadden
> >>>> <michael.macfadden@gmail.com> wrote:
> >>>>
> >>>>> As a follow up.  The reason you are struggling with the concept is that
> >>>>> you have tied the operation language directly to a specific data model, in
> >>>>> much the way wave did.  They created a conversation model and a specific
> >>>>> set of operations that act on that model.  When you do that your
> >>>>> operations are making assumptions on how the object model works.  This
> >>>>> coupling is not a good idea.  Much of the OT community strongly recommends
> >>>>> avoiding this.
> >>>>>
> >>>>> Rather, create a generic set of operations that manipulate things in an
> >>>>> abstract way, and then let the application sort out what to do with the
> >>>>> operations when it receives them.  The OT stack only needs to understand how
> >>>>> the parameters of the operations interact, such as positional arguments
> >>>>> for insert and delete style operations.  The OT Stack doesn't need to know
> >>>>> that the thing you are inserting is a character, a contact card, a
> >>>>> database record, or an object in a list.  It doesn't care.  It just knows
> >>>>> that if one insert happens before another it has to increment the index of
> >>>>> the second operation.
> >>>>>
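The index rule Michael describes can be sketched without the transform ever
looking at what is being inserted. This is an illustration under assumptions,
not code from Wave or ShareJS. The Insert class, the site field used to break
ties between inserts at the same index, and the function names are all
hypothetical.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Insert:
    index: int  # position in the sequence
    item: Any   # opaque payload: a character, a contact card, a record...
    site: int   # assumed tie-breaker so both sides order equal indexes alike

def transform_insert(op, other):
    """Rewrite `op` so it can be applied after `other` has already been applied."""
    other_goes_first = other.index < op.index or (
        other.index == op.index and other.site < op.site
    )
    if other_goes_first:
        return Insert(op.index + 1, op.item, op.site)  # shift right by one
    return op

def apply_insert(seq, op):
    """Return a new list with the payload inserted at op.index."""
    return seq[:op.index] + [op.item] + seq[op.index:]

base = ["a", "b", "c"]
card = Insert(1, {"contact": "Ann"}, site=1)  # payload is a dict
number = Insert(2, 42, site=2)                # payload is an int

# Either order of arrival gives the same final list.
one_way = apply_insert(apply_insert(base, card), transform_insert(number, card))
other_way = apply_insert(apply_insert(base, number), transform_insert(card, number))
```

The transform only compares index and site, which is the decoupling being
argued for: the payload type never enters the logic.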
> >>>>> If things are decoupled in this way, the whole OT stack becomes much more
> >>>>> flexible.  As one of the founders of OT says almost every time I see him,
> >>>>> "Let OT focus on what it is good at, and let it ignore everything else".
> >>>>>
> >>>>> ~Michael
> >>>>>
> >>>>> On 6/13/13 7:54 PM, "Joseph Gentle" <josephg@gmail.com> wrote:
> >>>>>
> >>>>>  So you're imagining storing rich text like this?
> >>>>>>
> >>>>>> {doc: 'hi there!', annotations: [{from:0, to:2, bold:true}]} or
> >>>>>> something?
> >>>>>>
> >>>>>> Every change to the document is going to need to manually update every
> >>>>>> single annotation which has start / end points after the edit. But it
> >>>>>> wouldn't work - if you insert some text and I edit an annotation later
> >>>>>> in the document, my annotation will float forwards / backwards when I
> >>>>>> get your op because I don't know how I should change it.
> >>>>>>
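The bookkeeping Joseph refers to, updating every annotation range after a text
insert, looks roughly like the sketch below. It covers only the local update
and does not address the concurrent-edit problem he raises. The function name
shift_annotations and the boundary rule (an insert exactly at the end of a
range does not extend it) are assumptions for illustration.

```python
def shift_annotations(annotations, index, length):
    """Adjust {from, to} ranges after `length` characters are inserted at `index`."""
    shifted = []
    for ann in annotations:
        new_ann = dict(ann)  # copy so the caller's list is left untouched
        if index <= ann["from"]:
            # Insert lands before the range: move the whole range right.
            new_ann["from"] += length
            new_ann["to"] += length
        elif index < ann["to"]:
            # Insert lands inside the range: the range grows.
            new_ann["to"] += length
        # Otherwise the insert is at or after the end: no change.
        shifted.append(new_ann)
    return shifted

annotations = [{"from": 0, "to": 2, "bold": True}]
before = shift_annotations(annotations, 0, 3)
inside = shift_annotations(annotations, 1, 2)
after = shift_annotations(annotations, 5, 4)
```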
> >>>>>> This idea comes up about every 6 months on the sharejs mailing list.
> >>>>>> Several solutions have been proposed, but none of them work correctly.
> >>>>>> I think we just need a separate set of transform / apply / ...
> >>>>>> functions for rich text.
> >>>>>>
> >>>>>> -J
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Jun 13, 2013 at 1:19 AM, Michael MacFadden
> >>>>>> <michael.macfadden@gmail.com> wrote:
> >>>>>>
> >>>>>>> Joseph,
> >>>>>>>
> >>>>>>> I disagree.  The annotations themselves are just another data structure.
> >>>>>>> You add them, remove them and modify them like anything else.  You can
> >>>>>>> manage annotations as another structure within the blip model.  There is
> >>>>>>> no reason why you can't interface with them through a JSON-style operations
> >>>>>>> structure.
> >>>>>>>
> >>>>>>> ~Michael
> >>>>>>>
> >>>>>>> On 6/13/13 12:11 AM, "Joseph Gentle" <josephg@gmail.com> wrote:
> >>>>>>>
> >>>>>>>  The conversation *model* yes, but not the rich text documents
> >>>>>>>> themselves. You can't really make text annotations work properly on
> >>>>>>>> top of JSON operations. We should keep something like the current
> >>>>>>>> system for actual blips.
> >>>>>>>>
> >>>>>>>> -J
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Jun 12, 2013 at 4:06 PM, Michael MacFadden
> >>>>>>>> <michael.macfadden@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Actually I just went and took a look at your operations.  The JSON OT type
> >>>>>>>>> is probably the closest to what I would suggest we use.  JSON Objects are
> >>>>>>>>> not just for javascript.  They define arbitrary object structures.  We
> >>>>>>>>> don't need a specific wave XML type, we could use the JSON operations to
> >>>>>>>>> modify the conversation model.
> >>>>>>>>>
> >>>>>>>>> Potentially.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 6/12/13 10:55 PM, "Joseph Gentle" <josephg@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>  Really?
> >>>>>>>>>>
> >>>>>>>>>> My method for ShareJS was to simply have a JSON OT type and a
> >>>>>>>>>> plaintext OT type. I'd like to add a rich text OT type as well. Then
> >>>>>>>>>> people can just pick which one based on what kind of data they have.
> >>>>>>>>>>
> >>>>>>>>>> For Wave I'd like to be able to do something similar - JSON is
> >>>>>>>>>> obviously useful for storing application data. It'd be nice to have
> >>>>>>>>>> some sort of hybrid for wavelets where we can put multiple different
> >>>>>>>>>> kinds of data inside a wavelet. One option is to use a JSON OT type as
> >>>>>>>>>> the root of all wavelets and support subdocuments at arbitrary paths
> >>>>>>>>>> (so the object could be:
> >>>>>>>>>> {projectName:"ruby on rails", files:[{name:'foo/bar.rb', ...}],
> >>>>>>>>>> documentation:{_type:richtext, _data:"<Rich text data>"}})
> >>>>>>>>>>
> >>>>>>>>>> Or wavelets could simply each have a type (defaulting to the current
> >>>>>>>>>> wavey XML type).
> >>>>>>>>>>
> >>>>>>>>>> -J
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Jun 12, 2013 at 2:41 PM, Michael MacFadden
> >>>>>>>>>> <michael.macfadden@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> You have stumbled upon one of the weaknesses of wave OT.  Best practices
> >>>>>>>>>>> would say to NOT bind your OT directly to the data type, because then you
> >>>>>>>>>>> don't have an extendable model. For example if you have all of your
> >>>>>>>>>>> operations figured out and validated, and then you need to change your
> >>>>>>>>>>> data model, you have to go back and mess with your transformation
> >>>>>>>>>>> functions.  Not good.  Or you have to try to bend new data models into
> >>>>>>>>>>> the existing one, also not good.
> >>>>>>>>>>>
> >>>>>>>>>>> Best practice is to create a generic OT model and operate on that.  There
> >>>>>>>>>>> is debate as to what the model should be, but most agree on the concept.
> >>>>>>>>>>>
> >>>>>>>>>>> For example in wave they tried to create a map-like collection that OT
> >>>>>>>>>>> could operate on. Essentially though they had to implement the map as if
> >>>>>>>>>>> its underlying model was a bunch of XMLish type tags.  This was very
> >>>>>>>>>>> convoluted.
> >>>>>>>>>>>
> >>>>>>>>>>> ~Michael
> >>>>>>>>>>>
> >>>>>>>>>>> On 6/12/13 10:26 PM, "Joseph Gentle" <josephg@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>  Yeah exactly. The google wave OT code uses special operations that can
> >>>>>>>>>>>> understand the XML structure. It doesn't just edit the plaintext.
> >>>>>>>>>>>> Formatting annotations are stored in a special way - operations can
> >>>>>>>>>>>> say something like "At position 10 add bold. At position 20 stop
> >>>>>>>>>>>> adding bold".
> >>>>>>>>>>>>
> >>>>>>>>>>>> -J
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jun 12, 2013 at 1:56 PM, Bruno Gonzalez (aka stenyak)
> >>>>>>>>>>>> <stenyak@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I suspected something like that. I assume it also correctly handles
> >>>>>>>>>>>>> variable-length UTF8 characters, so it's not necessarily 1-byte
> >>>>>>>>>>>>> patches?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This starts to make sense. OT can only compute conflict-free merges using
> >>>>>>>>>>>>> the "character" primitive (because that's how Wave was originally
> >>>>>>>>>>>>> designed). As an unfortunate consequence, you can then only OT-operate on
> >>>>>>>>>>>>> plain text. Otherwise you could get conflict-free xml text that <loo<ks
> >>>>>>>>>>>>> li<>ke>this>, and that of course isn't legal xml.
> >>>>>>>>>>>>> But we still want rich text in Google Wave, therefore all the formatting
> >>>>>>>>>>>>> stuff is stored some place else, specifically in the blip annotations. The
> >>>>>>>>>>>>> modifications to annotations are (sometimes) simply derived from the
> >>>>>>>>>>>>> transformations that the plain text suffers after merges?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I suppose there could be other OT algorithms that don't use a "character"
> >>>>>>>>>>>>> primitive, but rather an "xml tag" primitive, a json item, a "pixel", or
> >>>>>>>>>>>>> anything else, right?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> (sorry for only contributing with questions... :-)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Jun 12, 2013 at 10:27 PM, Joseph Gentle <josephg@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>  On Wed, Jun 12, 2013 at 12:13 PM, Bruno Gonzalez (aka stenyak)
> >>>>>>>>>>>>>> <stenyak@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> My assumption was that conflicts were simply mathematically inevitable
> >>>>>>>>>>>>>>> in DVCSs, that's why your mention about lack of conflict markers
> >>>>>>>>>>>>>>> sparked my interest... you mention conflicts like they can be optional?
> >>>>>>>>>>>>>>> If so, are conflicts "eliminated" by choosing an arbitrary merging
> >>>>>>>>>>>>>>> strategy when conflicts *do* happen (e.g. "choose the last timestamped
> >>>>>>>>>>>>>>> patch and lose information on the way, we don't care"), or can they be
> >>>>>>>>>>>>>>> prevented from ever happening in the first place?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> They're inevitable in patch based systems because patches usually have
> >>>>>>>>>>>>>> a line level granularity. OT usually uses individual character
> >>>>>>>>>>>>>> positions. In OT, if two operations both delete the same character,
> >>>>>>>>>>>>>> the character gets deleted once. If two clients insert a character at
> >>>>>>>>>>>>>> the same position, one of the characters will be first in the
> >>>>>>>>>>>>>> resultant document and one will be second. Conflict markers just
> >>>>>>>>>>>>>> aren't necessary.
> >>>>>>>>>>>>>>
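The "deleted once" behaviour Joseph describes can be shown for single-character
deletes on a string. This is a simplified sketch, not Wave's algorithm. The
function names are hypothetical, and using None to signal "nothing left to do"
is an assumption of the example.

```python
def delete_at(doc, pos):
    """Return doc with the single character at `pos` removed."""
    return doc[:pos] + doc[pos + 1:]

def transform_delete(pos, other_pos):
    """Adjust a delete at `pos` given that a delete at `other_pos` already ran.

    Returns the new position, or None when the other delete already removed
    the same character.
    """
    if other_pos < pos:
        return pos - 1  # text to the left shrank by one character
    if other_pos == pos:
        return None     # same character: it is only deleted once
    return pos

def apply_both(doc, first, second):
    """Apply the delete at `first`, then the transformed delete at `second`."""
    doc = delete_at(doc, first)
    moved = transform_delete(second, first)
    return doc if moved is None else delete_at(doc, moved)

same_char = apply_both("abc", 1, 1)   # both clients delete "b"
order_one = apply_both("abc", 0, 2)   # delete "a", then the other client's "c"
order_two = apply_both("abc", 2, 0)   # same two deletes, opposite arrival order
```

Both arrival orders end with the same text, so no conflict marker is needed.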
> >>>>>>>>>>>>>> -J
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>  --
> >>>>>>>>>>>>>>> Saludos,
> >>>>>>>>>>>>>>>       Bruno González
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ______________________________**_________________
> >>>>>>>>>>>>>>> Jabber: stenyak AT gmail.com
> >>>>>>>>>>>>>>> http://www.stenyak.com
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Saludos,
> >>>>>>>>>>>>>       Bruno González
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ______________________________**_________________
> >>>>>>>>>>>>> Jabber: stenyak AT gmail.com
> >>>>>>>>>>>>> http://www.stenyak.com
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >>>
> >>>
> >>
> >
> >
> >--
> >Saludos,
> >     Bruno González
> >
> >_______________________________________________
> >Jabber: stenyak AT gmail.com
> >http://www.stenyak.com
>
>
>
