incubator-wave-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Blossom <jblos...@gmail.com>
Subject Re: Delving into OT
Date Mon, 08 Jul 2013 13:43:47 GMT
Ingenious, Torben, certainly adds efficiency. John

On Mon, Jul 1, 2013 at 4:38 AM, Torben Weis <torben.weis@gmail.com> wrote:

> 2013/6/25 Joseph Gentle <josephg@gmail.com>
>
> >
> > >> When peers connect, they send each other missing ops. Figuring out
> > >> which ops are missing can be surprisingly tricky - but we'll figure
> > >> that out later. New ops must be ingested in order, so we always ingest
> > >> an operation after ingesting all of its parents.
> >
> > Just use a Merkle Tree that is at the same time a prefix tree with
> respect
> to the hashes of the ops (explanation below).
> The bandwidth usage is O(1) if both clients are in sync and O(log n) if
> they have one or few different ops and O(n) in the worst case, where n in
> the number of ops.
>
> Constructing the tree is simple.
> Let the hash function output 20 bytes and let's encode this in hex. This
> results in a hash-string of 40 hex-characters for each operation.
> Each node hashes over the hashes of its children. Leaf-nodes correspond to
> operations and thus use the hash value of their respective operation.
> The tree-invariant is that all siblings on level x share the same prefix of
> x hex-characters.
> The tree is not sent over the network. Instead, clients start comparing the
> hashes at the root.
>
> Two clients compare their root hash. If it is equal, the entire tree is
> equal and therefore they are in sync.
> If not, they download all direct children and repeat the procedure for each
> sub-tree rooted by one of these children.
> For example, if child number 3 has a different hash, but all others share
> the same hash, then we have learned that there are one or more ops with a
> hash of 3xxxx... that are different and need syncing.
>
> Typically we can limit the depth of the tree to few levels. 8 levels
> already yield a tree that could store 16^8 possible ops. So in the worst
> case two clients need to wait for 8 round-trips to determine a missing op.
>
> In addition, each client sends a time stamp. So when syncing we report the
> last time stamp received from this client and ask for all ops this client
> received later. If these are few, then simply get them (even if we know
> some of the ops already, because we got them from another client). If there
> are too many ops, fall back to the merkle tree. With a good approximation
> of RTT and bandwidth, it is easy to calculate which algorithm is the best
> to sync two clients.
>
> Greetings
> Torben
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message