incubator-wave-dev mailing list archives

From John Blossom <jblos...@gmail.com>
Subject Re: IRC discussion on P2P waving
Date Fri, 21 Jun 2013 13:06:42 GMT
Bruno,

Thanks, this is an excellent summary. It helps me to get the gist of things
more clearly.

On the P2P latency, I don't think that it would be unacceptable to draw a
line and say that P2P provides limited, non-guaranteed realtime OT or that
it's not realtime OT and more of a syncing mode than a conversation mode.
That would probably be sufficient for what needs to be done, especially
since in some instances P2P-enabled Wave sessions may be using MESH
networks for transport - a key factor in how a lot of experimental
communications services are being deployed in developing nations (not just
the Project Loon concept). In the MESH model, you're likely to have one
node within range of another temporarily, which may sync with it, and then
pass along data to another node when it comes in range of it. That's the
most probable scenario for P2P in many instances, I would think. The other
potential scenario: two people in a remote location, for the sake of
argument two movie script-writers who have holed themselves up in a remote
location to collaborate on a common script. They're on two devices that are
very proximate to one another, so perhaps the latency issues will not be so
severe.

Things to think about; I will look at this more carefully later today.

All the best,

John Blossom

On Fri, Jun 21, 2013 at 8:05 AM, Bruno Gonzalez (aka stenyak) <
stenyak@gmail.com> wrote:

> Following Joseph's "A Very Wavey Plan (P2P!)" thread, a couple of
> discussions have taken place at the irc.freenode.net #wiab channel, all
> related to P2P.
>
> I've taken the liberty to restructure the IRC logs, remove some chitchat,
> and divide it into sub-discussions. Feel free to reply to any part of this
> email to continue a discussion.
>
>
> *Summary of discussions:*
> *====================*
> *1) Underlying protocol for P2P federation*
> Currently XMPP is used. HTTP and raw TCP are two suggested candidates (HTTP
> makes it much easier to reach restricted networks).
>
> *2) Message/event types needed for P2P federation to work*
> We'd need something similar in concept to certain git operations (git
> clone, git push...). All will be based on hashes (not incremental
> integers).
>
> *3) Routing p2p messages/events in a server-aided network*
> One option is to somehow detect server clusters, send data to one of them,
> and let the rest of the cluster servers synchronize to it (locally).
> Alternatively, the originator server can naively send stuff to all possible
> destination servers, regardless of the cost.
>
> *4) Routing p2p messages/events in a pure P2P system (5 parts)*
> How to route all wave-stuff if we want to get rid of servers completely
> and only use peers.
> The closest way would be to use a DHT, but huge latency is an unsolved
> problem, and makes it impossible to use for real-time waving.
> No other solution has been proposed.
>
> *5) Implementing "undo": invertibility, tombstones, edge cases, TP2*
> No server means no canonical order of commits, which means that undo is
> hard to do correctly.
> (uhm... not sure if that's a good summary, some stuff went over my head
> :-D, please read the log instead)
>
> *6) Usability of a pure p2p system in Real Life (tm)*
> Being pragmatic, pure P2P is probably only usable by peers with good
> connectivity. The rest of the peers will need to rely on a server/proxy that
> *does* have good connectivity.
>
> *7) Comparison with BitTorrent and P2P-TV technologies*
> Both technologies are much less constrained than wave with regard to
> real-time responsiveness, so neither is really a good reference for our
> purposes.
>
> *8) Identifying participants (3 parts)*
> Pure p2p means many peers don't have a name@centralized-server.com user
> handle, so an alternative has to be used.
> However, it's easy to provide a traditional friendly handle, if the user
> prefers the tradeoff of having to often rely on a permanent server. This
> tradeoff can be mitigated by using a sort of userhandle cache.
>
> *9) P2P anonymity (lurking in a wave) (2 parts)*
> In a pure p2p wave network, anonymous peers may want to read a public wave
> without other peers knowing. A solution could be to make the required
> wavelets (where the anonymous participants' IDs are stored) private.
>
> *10) Encryption of waves*
> It's been proposed to use an AES key to encrypt all the wave data, and only
> allow participants to decrypt it.
>
> *11) Addition and removal of participants, and their ability to read past
> and future wave versions/deltas*
> The aforementioned AES key can change over time, allowing a finer-grained
> restriction of what deltas new/removed participants can read.
>
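
Points 10 and 11 above can be pictured with a small bookkeeping sketch. It is only an illustration under assumptions: the `WaveKeyring` class, the `rotate` method and the participant names are hypothetical, no actual encryption is performed, and the policy shown (a participant only gets keys for the deltas written while they were a member) is just one of the restrictions a changing key would allow.

```python
import secrets


class WaveKeyring:
    """Hypothetical record of which key covers which range of deltas."""

    def __init__(self, members):
        self.epochs = []  # (first_delta_version, key, members), oldest first
        self.rotate(0, members)

    def rotate(self, first_version, members):
        """Start a new epoch with a fresh random 256-bit key."""
        key = secrets.token_bytes(32)
        self.epochs.append((first_version, key, frozenset(members)))

    def epoch_for(self, version):
        """Index of the newest epoch starting at or before `version`."""
        return max(i for i, (first, _, _) in enumerate(self.epochs)
                   if first <= version)

    def key_for(self, user, version):
        """Return the key for a delta, or None if `user` was not a member."""
        _, key, members = self.epochs[self.epoch_for(version)]
        return key if user in members else None


ring = WaveKeyring({"alice", "bob"})
ring.rotate(10, {"alice", "carol"})  # bob removed, carol added at delta 10
```

In this sketch "bob" keeps access to deltas 0 through 9 but gets no key for delta 10 onwards, while "carol" gets no key for the history before she joined.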
>
>
> *Actual conversations:*
> *====================*
> *1) Underlying protocol for P2P federation:*
> [in response to Joseph's email]
> [23:42] <alown> I [...] agree with option 2 (make every root a JSON blob)
> [23:43] <alown> You haven't really detailed (at all) how the P2P federation
> is actually going to work (beyond 'not like IRC')
> [23:44] <josephg> Personally, I'd love some raw TCP action
> [23:44] <alown> I agree using KISS principle.
> [23:44] <josephg> a few years ago (not long after wave was cancelled) there
> was a 'wave summit'
> [23:45] <josephg> - and a few of us chatted about how we could make the
> federation protocol simpler
> [23:45] <josephg> we ended up (somehow) deciding that doing it over http
> would be a good idea
> [23:45] <josephg> because then we could sneak it into companies past their
> corporate HTTP firewalls, etc
> [23:45] <josephg> but in any case, I'd like to figure out the protocol and
> (at least) have a TCP version
> [23:46] <josephg> it should be pretty easy to wrap the same messages in
> websockets if we want
>
>
> *2) Message/event types needed for P2P federation to work:*
> [23:46] <alown> Do we need anything more complicated than the
> waveletSubmit/Commit messages used currently?
> [23:46] <alown> (Replace wavelet with 'abstract p2p ot container name')
> [23:46] <josephg> um, yeah.
> [23:47] <josephg> we'll also be able to rip out all the code that deals
> with managing the tree of servers per wave
> [23:47] <josephg> but yeah - the protocol will get a bit more complicated
> [23:47] <josephg> ... because we'll lose our beautiful integer version
> numbers
> [23:47] <josephg> so we'll need a protocol for synchronizing ops
> [23:48] <josephg> yeah - ops will each have a hash
> [23:48] <josephg> and two servers could each have ops the other server
> doesn't have
> [23:48] <josephg> so we have to be able to deal with that
> [23:47] <alown> What other 'events' are cared about by any particular
> server?
> [23:47] <alown> For a SHA hash?
> [23:48] <josephg> -> we'll need something like git's sync protocol
> [23:48] <alown> So, initial server contact is 'git clone', and then some
> form of 'git push' on changes?
> [23:49] <josephg> yep.
> [23:49] <josephg> push on changes is easy - its basically the same thing we
> have now
> [23:49] <josephg> just instead of saying "This should be applied at version
> 10" we say "This op has parents [abc123, def456]"
>
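
The "ops have hashes and parents" idea from the log above can be sketched roughly as follows. This is a minimal illustration, not the Wave protocol: the `OpStore` class, the `op_id` helper, the choice of SHA-256 over a JSON encoding, and the string payloads are all assumptions made for the example.

```python
import hashlib
import json


def op_id(parents, payload):
    """Hash an op's payload together with its (sorted) parent hashes."""
    body = json.dumps({"parents": sorted(parents), "payload": payload},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class OpStore:
    """Hypothetical per-server store: op hash -> (parents, payload)."""

    def __init__(self):
        self.ops = {}

    def add(self, parents, payload):
        h = op_id(parents, payload)
        self.ops[h] = (tuple(sorted(parents)), payload)
        return h

    def heads(self):
        """Ops that no other known op names as a parent."""
        referenced = {p for parents, _ in self.ops.values() for p in parents}
        return sorted(set(self.ops) - referenced)

    def missing_from(self, other):
        """Hashes that `other` has and this store lacks."""
        return sorted(set(other.ops) - set(self.ops))

    def pull(self, other):
        """Copy over every op this store lacks."""
        for h in self.missing_from(other):
            self.ops[h] = other.ops[h]


a, b = OpStore(), OpStore()
root = a.add([], "create doc")
b.pull(a)                             # initial contact, like 'git clone'
x = a.add([root], "insert 'hi'")      # concurrent edits on each server
y = b.add([root], "insert 'yo'")
a.pull(b)
b.pull(a)                             # both now hold the same set of ops
merge = a.add([x, y], "insert '!'")   # "this op has parents [x, y]"
```

After the exchange each store has two heads (`x` and `y`) until an op naming both as parents arrives, which is the situation an integer version number cannot express.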
>
> *3) Routing p2p messages/events in a server-aided network:*
> [23:49] <alown> With P2P do we have to broadcast to all peers? How do we
> coordinate that between them?
> [23:50] <josephg> between servers? I dunno.
> [23:50] <alown> How does BT handle this?
> [23:50] <josephg> should we just connect every server to every other
> server? That'd work fine...
> [23:50] <josephg> I guess every server can address every other server
> [23:50] <josephg> because the wave will have alown@a.com and
> josephg@b.com and so on on it
> [23:50] <alown> This feels very inefficient...
> [23:51] <josephg> so if you submit an op to your server, your server can go
> "Oh, I need to tell b.com about this too"
> [23:51] <josephg> well, if there's 10 servers, presumably all 10 servers
> need to find out about ops somehow.
> [23:51] <josephg> - assuming we stick with the current model of having
> servers store all your operations
> [23:51] <josephg> .. and documents for all the users at their domain
> [23:51] <alown> But server 'b' and 'c' might both be part of a wave, but
> also know each other, and know that they are 'closer' to each other than
> 'a' is. So, we would want a->b/c then b<->c
> [23:52] <josephg> so actually, having the server which originates an
> operation send it to all the other servers on that wave is actually close
> to ideal.
> [23:52] <josephg> yeah maybe.
>
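
The naive approach discussed above (the originating server tells every other server on the wave) might look like the sketch below. The function names, the `send` callback and the extra participant `stenyak@b.com` are hypothetical; it only assumes that participant handles have the `user@domain` form mentioned in the log.

```python
def destination_domains(participants, local_domain):
    """Remote domains that need to hear about an op on this wave."""
    domains = {p.rsplit("@", 1)[1] for p in participants}
    domains.discard(local_domain)
    return sorted(domains)


def broadcast(op, participants, local_domain, send):
    """Naively send `op` once to every remote domain on the wave."""
    targets = destination_domains(participants, local_domain)
    for domain in targets:
        send(domain, op)
    return targets


sent = []
broadcast(
    "op-abc123",
    ["alown@a.com", "josephg@b.com", "stenyak@b.com"],
    "a.com",
    lambda domain, op: sent.append((domain, op)),
)
```

Each remote domain receives the op once regardless of how many of its users are on the wave; the cluster-aware variant (a->b/c, then b<->c) would need extra knowledge about which servers are close to each other.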
>
> *4) Routing p2p messages/events in a pure P2P system (part 1):*
> [23:54] <alown> BT uses DHT for its P2P stuff...
> [23:54] <josephg> ...I guess we could use a DHT storing all the ops, but
> thats pretty slow
> [23:55] <josephg> and you still need to notify all servers with users on
> the wave that the wave was updated.
> [23:55] <alown> Maybe, or perhaps only notify those within a certain
> 'distance', with each server doing that. (Though could mean some servers
> are never updated)
> [23:58] <alown> Perhaps we could make the network setup 'SuperWaves' which
> broadcast to all peers, and carry all information, but normal wave servers
> do not reach this status?
> [23:58] <alown> By having it decide itself based on how 'connected' a
> server is, this could find the most efficient ways to route it.
> [00:01] <josephg> Do you think it'll really be a problem?
> [00:01] <josephg> I mean, thinking about it - how many servers will be on a
> given wave?
> [00:01] <alown> Depends.
> [00:01] <alown> No idea.
> [00:01] <josephg> If it were a public wave, I can imagine clients just
> connecting to one (or more) centralized servers
> [00:01] * josephg nods
> [00:02] <josephg> ... But say if we were having a conversation on
> wave-dev@apache, there's like, at most 20 people in a discussion from 5 or
> so domains
> [00:03] <josephg> ... I think we can deal with that kind of load.
> [00:04] <josephg> but if the protocol lets any server tell any other server
> about an operation, then it should be pretty easy to set up something like
> that.
> [00:04] <josephg> maybe.
> [00:04] * josephg thinks
> [00:05] <josephg> hm - you're right. I think I've just gotten used to the
> crappy state of doing routing for broadcasting messages to a network
> [00:05] <josephg> if you can find / think of a better solution, I'm in.
> [00:12] <alown> Heh, anyway replacing the network layer code SHOULD be
> easy, since it SHOULD be cleanly separated.
> [00:13] <alown> Getting an initial implementation up using broadcast is
> fine.
> [00:13] <alown> (I was thinking of Wave's use in other apps as a reason you
> could have a lot of different participant domains)
> *...4) Routing p2p messages/events in a pure P2P system (part 2):*
> [08:53] <stenyak> as for the "how to *really* do p2p", i see two options:
> a) use a dht-like algorithm and/or b) use a helper server to route stuff
> for you
> [08:54] <stenyak> a) can be pretty slow if you want all OPs to reach all
> peers (if I'm not mistaken)
> [08:54] <stenyak> and b) essentially makes it not-p2p
> [08:55] <stenyak> additionally, using p2p, how are we going to deal with
> routing problems (such as firewalls on both sides, etc)?
> [08:56] <stenyak> in my mind, the only universal solution is to have a
> third party server available to go through if we want speed or if we want
> to work on all edge cases
> [08:56] <stenyak> and wave being advertised as realtime, i don't see how
> something like dht can ever fly
> [11:20] <alown> stenyak: This is why I was wondering about a DHT system
> with 'Superwave' servers (to act as a first point of contact).
> [11:59] <stenyak> that would be like skype dynamic supernode list?
> [11:59] <alown> The original system, yes.
> [12:02] <stenyak> so we would devise a method to identify candidates to
> being a supernode, in order to prevent cellphone wave peers from becoming
> one, and in order to promote certain other nodes (like major peers that have
> 99% uptime, e.g. wave.google.com or whatever) to become one
> [12:03] <stenyak> bandwidth, latency, open ports, uptime...
> [12:04] <alown> Once a network has been bootstrapped using something, it is
> relatively easy to identify the hosts which are most densely connected (and
> would be good supernode candidates)
> [12:05] <stenyak> what do you mean with "using something"?
> [12:06] <alown> Somehow the network has to initially be able to make
> contact with other nodes (before it knows anything about them)
> [12:07] <alown> For a LAN you could get away with a broadcast 'announce',
> but it is a bit less clear on an internet-sized scale.
> [12:08] <stenyak> bittorrent sync uses a broadcast for LAN. for internet it
> uses a tracker server for fast discovery of peers, or you can disable that
> and force to use DHT (with the long wait that means)
> [12:09] <stenyak> the tracker can also act as a meeting-point for
> firewalled peer pairs (which in my experience is a lot of them)
> [12:09] <alown> Precisely the problem, because we don't really want long
> waits or trackers.
> *...4) Routing p2p messages/events in a pure P2P system (part 3):*
> [12:42] <stenyak> hmmm... i'm not sure how a peer gets a list of the waves
> he's a participant of
> [12:43] <alown> Having a canonical source makes it all so much easier. :P
> [12:44] <stenyak> for pure p2p peers to "receive" new waves, either the
> FROM or the TO peer (or both) would need to try to find their way to the
> other
> [12:44] <stenyak> and we're assuming here that each person only runs one
> peer
> [12:45] <stenyak> e.g. my privatekey may be used by 5 wave peers at the
> same time, and we must make sure the new wave reaches all of them
> [12:46] <alown> Looks like we may need to have multiple DHTs then (one for
> ops, one for waves)
> [12:46] <stenyak> in BT, it's the receiver end who actively looks for peers
> to receive from. in wave, it's not like that..
> [12:46] <alown> Or could we have a pubkey->wave mapping in one?
> [12:46] <stenyak> and in BT, you can assume *many* people have the data you
> want
> [12:46] <stenyak> in wave, it's possible and probable that only one other
> peer in the universe has the wave
> [12:46] <stenyak> (because it's a personal wave sent to you)
> [12:47] <alown> I would expect any long-running supernodes to be implicitly
> part of all waves they know about.
> [12:47] <alown> Though on second thought, this seems like it would add its
> own problems to authentication, storage, promotion of supernodes etc.
> *...4) Routing p2p messages/events in a pure P2P system (part 4):*
> [12:51] <alown> Does it make sense for a peer to have your privkey, since
> you could be logged in anywhere, so it would be down to the place you are
> logged in, to 'subscribe' to that wave on the network, and attempt to
> retrieve all data from it...
> [12:55] <alown> I was expecting the network as a whole to act like a
> WaveBus pubsub system, whereby once 'logged in' at some server (which means
> it gets your privkey from the authentication system), that server then
> 'subscribes' to your waves on the 'network'. If somebody else at some other
> server changes it, then that server would be announcing to the network of a
> change (doesn't necessarily have to be a broadcast), which your server
> would 'hear'.
> [12:56] <alown> You could do this from any server where you logged in
> (hence the concept of a domain is lost).
> [12:57] <stenyak> by "server" you mean supernodes?
> [12:57] <alown> Not necessarily.
> [12:59] <stenyak> this pubsub network must be aware of nodes that are in
> it, in order to directly route wave updates to them, correct?
> [12:59] <stenyak> and also, this network wouldn't be very volatile, but
> would rather ideally be long-lived peers?
> [13:00] <alown> It has no reason to have to directly route updates, (though
> it would hopefully be able to identify the best routes automatically).
> [13:00] <alown> Yes it would require a few long-lived peers (which would be
> part of the requirement to be a supernode).
> [13:01] <stenyak> so let's say i connect my laptop wave peer to the
> "server" in the living room, at my firewalled home. this "server" would be
> already subscribed to the pubsub network, and in this specific case it
> would route all wave updates to me
> [13:02] <stenyak> in other cases (let's say, ipv6-enabled nodes everywhere,
> no firewall at home), the living room server could simply notify the
> original "FROM" peer to send stuff to my laptop ipv6 ip, right?
> [13:03] <alown> That sounds right. Supernodes are really only needed for
> getting the routing right.
> [13:05] <stenyak> ok. in both these theoretical cases, the "server" hasn't
> necessarily been a wave node per se (nor a supernode either), but rather a
> second type of wave node that helps get stuff quickly wherever it's needed
> [13:05] <alown> Yes.
> [13:05] <alown> I am not even sure where OT should be happening in this
> picture...
> [13:05] <stenyak> if OT happens, the "server" is a blind proxy i think
> [13:06] <stenyak> so does not need the privkey to work
> [13:07] <stenyak> unless we're also using OT in the wavebus pubsub network
> for some reason?
> [13:07] <alown> Supernodes can be blind (though they might also just be
> normal well-connected wave servers). I would expect normal servers to still
> be doing OT. The question is whether the 'client' (whatever that means)
> should be doing it also.
> [13:08] <alown> The network shouldn't need OT. (Algorithms exist that allow
> the incoming ops to be arbitrarily queued and only processed when needed).
> [...]
> [21:21] <josephg> alown: the client always needs to do OT because otherwise
> they can't both edit a document live and receive operations from people who
> didn't have their ops.
> [21:22] <josephg> the server doesn't need to do OT, although if it doesn't
> do OT, it'll punt the OT work to its clients - which will result in a
> higher CPU utilization on mobile devices.
> [...]
> [13:08] <stenyak> i pictured this "server" as being an optional item that
> shortcuts the long waits of DHT, rather than something necessary for
> "clients"?
> [13:08] <alown> Hmm.
> [13:08] <alown> I suppose we should define what a 'client' is then...
> [13:09] <alown> We have at least 2 layers of stuff going on here: 1) Wave
> OT/operation layer 2) Network routing/P2P layer
> [13:13] <alown> But it is quite plausible something might be doing both of
> those
> [13:10] <stenyak> with your pubsub net suggestion, i was picturing 2 kinds:
> a regular pure p2p peer, and a helper kind of node to route stuff quickly
> when a peer is connected to it
> [13:13] <stenyak> so with that picture in mind, layer 1 stuff could go
> directly from peer to peer (if connectivity/firewalls allows), or through
> the "helper node" if available
> [...]
> [13:20] <stenyak> [...] all this discussion looks very similar to
> discussing how to design internet+dns, i think the problems are the same
> really
> [13:20] <stenyak> or at least we could take some inspiration from it maybe
> [13:20] <alown> This was my conclusion last night with josephg. ('The
> problems should already be solved (see The Internet)')
> [14:09] <stenyak> and The Internets solved the problem how? By having a
> large set of supernodes (dns servers), that may take a whole day to
> propagate updates. The alternative being having the actual IP address in
> the first place, or to centralize stuff
> [14:10] <stenyak> (aka use servers everywhere)
> [14:22] <alown> Maybe, but the internet's design is X (where X > 20) years
> old, so may not represent the most modern thinking of how to make
> distributed networks.
> [14:59] <alown> (Don't forget that our aim for Wave is at the cutting-edge
> of academic research also).
> [...]
> [14:50] <stenyak> i just threw the question at some friends who should be
> more up-to-date with networking technologies than me... hopefully they
> come back with some revolutionary dns-2 design or something that we can copy
> [15:18] <stenyak> could give us some ideas: http://openpeer.org/
> [15:18] <stenyak> (it's not a solution, but maybe they did the same
> reasoning we're going through)
> [15:46] <stenyak> another response i got goes along the lines of... hard as
> fuck, but if you manage to do it, you are a hero
> [...]
> [15:02] <stenyak> looking at it from a wider perspective, what we want is
> similar to having each peer shout at the whole world "here i am, anything
> got something for meeee?" in some way that doesn't clog the internet tubes,
> and that is as fast as shouting would be. i start to think it's not
> physically possible to do that...
> [15:03] <stenyak> if publickeys were handed to people based on the
> location, then we could have routing tables similar to how internet
> currently works
> [15:03] <stenyak> but pubkeys are... well, random. so that kind of routing
> that allows anyone to connect to an arbitrary IP in a matter of
> milliseconds is impossible, i believe
> [15:04] <alown> So, we end up with DNS for public keys?
> [15:04] <stenyak> something like dns, but much faster [wrt. propagation
> times]
> [15:05] <stenyak> so in essence, a tree of servers or whatever (which is
> similar to how wave currently works, right?)
> [15:05] <alown> Heh. But the whole point was to avoid the tree system
> currently (since it is susceptible to netsplits)
> [...]
> [15:56] <stenyak> maybe the real question could be: how do we make DHT much
> faster?
> [16:14] <stenyak> once the initial discovery process is finished, the
> transmission of data will not have the lag associated with DHT, so even if
> DHT takes 10 seconds, that could be acceptable
> [16:15] <stenyak> i.e. a new peer takes 10 seconds to be discovered by the
> rest of participants collaborating in a wave
> [16:16] <stenyak> (or viceversa.. the new peer takes 10 seconds to discover
> the participants)
> [...]
> [16:25] <stenyak> this could shed some light:
>
> http://en.wikipedia.org/wiki/Distributed_hash_table#Algorithms_for_overlay_networks
> [19:06] <stenyak> http://dsn.tm.kit.edu/english/2936.php
> *...4) Routing p2p messages/events in a pure P2P system (part 5):*
> [21:03] <josephg> [...] For now, I want wave to be p2p in the same way that
> git is p2p.
> [21:04] <josephg> that is, I want the core algorithms & data structures to
> use P2P-capable algorithms, and probably the wave servers will do p2p
> between themselves (this is easy because they'll all be both named and
> accessible)
> [21:06] <josephg> as for client-to-client p2p, there's a few options
> depending on what kind of use cases we want to support - but I want to
> worry about getting the algorithms p2p-capable first. If you're keen to set
> up an anonymous, distributed wave system over a DHT - well, I want to first
> make that possible
> [21:15] <josephg> .... and as for ipv6, network admins _love_ NAT now that
> we have it
>
>
> *5) Implementing "undo": invertibility, tombstones, edge cases, TP2:*
> [00:17] <alown> I am not sure how an 'undo stack' is going to work (at all)
> with federation...
> [00:18] <josephg> well, you just do undo at the application level
> [00:19] <josephg> "submit op which inserts text" ... later "submit op which
> removes text"
> [00:19] <josephg> you don't need OT for that.
> [00:20] <josephg> I imagine like, a semantic undo. In the client you can
> imagine making an undo op (which might not necessarily rollback an
> operation (because of tombstones and all that))
> [00:20] <josephg> ... but would seem that way as far as the user is
> concerned
> [00:21] <josephg> then if the user hits ctrl+z, you can transform that
> operation up to the current version and apply it
> [00:21] <josephg> - the fact that its an undo isn't really relevant.
> [00:21] <josephg> the bad thing about losing invertibility is doing
> playback
> [00:21] <josephg> - because you can't scrub back through time
> [00:21] <alown> But you have all the operations since the start, so you can
> play forward at least?
> [00:23] <josephg> yeah exactly.
> [00:23] <josephg> ... and make like, keyframes of the document
> [00:23] <josephg> - and play forward from them or something.
> [00:23] <alown> Hmm, so you can do the step-back without recalculating the
> entire document?
> [00:24] <alown> I don't really like the idea of then having another
> datastructure to have to pass around...
> [00:24] <josephg> right - if you have a snapshot at version 1000, and the
> user is looking at 1010 and they try to step back to 1009, you can just
> replay ops 1001-1009 on that version 1000 snapshot
> [00:24] <alown> What was the problem with invertible operations (I don't
> understand OT enough yet to be able to properly comment on that side).
> [00:25] <alown> (Other than it confuses people?)
> [00:25] <josephg> hahaha actually people seem to love invertibility
> [00:25] <josephg> I don't know why.
> [00:25] <josephg> I've been trying to remove it from sharejs, and everyone
> gets sad.
> [00:26] <josephg> the problem is that if I make an op which deletes the
> whole document (version 100, say) then I undo that operation
> [00:26] <josephg> and you insert in the middle of the document at version
> 100, then your op gets transformed to do that insert at the start of the
> document instead at version 101 (because the content has disappeared)
> [00:26] <josephg> and it never goes back to the middle of the document.
> [00:27] <josephg> so, with tombstones you can get around that by having a
> 'resurrect' operation
> [00:27] <josephg> (so deleting the whole document turns the whole document
> into tombstones, then we can resurrect them all again in the inverse)
> [00:28] <josephg> but you can't invert an insert - because deleting leaves
> the tombstone there
> [00:28] <josephg> and if you have a 'real delete' operation, then yeah,
> you're back in the hole
> [00:28] <josephg> also, with wave in particular, inverting is really
> complicated
> [00:29] <josephg> - see, if the wave says "<annotation bold:true>blah
> blah<annotation bold:false> not bolded"
> [00:29] <josephg> then if you insert at the end of the "blah blah", it'll
> automatically get bolded.
> [00:30] <josephg> ... so if the text isn't bolded, and then you bold it
> while I insert at the end of the text, you need to make sure my text
> _isn't_ bolded or something
> [00:31] <josephg> .... and yeah, I can't remember - but there's these
> horror cases that I remember kept me from sleeping when I tried to
> reimplement wave's OT code in C
> [00:31] <alown> hmm
> [00:31] <josephg> and it would have been fine if it wasn't invertible.
> Well, at least it would have been tolerable.
> [00:33] <josephg> So yeah. Conclusion: You can make invertibility work, but
> its kind of a bitch, and you can't make it work for TP2
> [00:33] <josephg> which means it won't work if we're federating
> [00:33] <alown> How are we hacking around that currently then?
> [00:33] <josephg> well, we don't do TP2
> [00:34] <josephg> remember, federation just uses a bad version of the
> current client-server protocol
> [00:34] <josephg> - arranged in a tree of servers
> [00:34] * alown goes and looks up which one TP2 was again
> [00:35] <josephg> ... its the one that says you don't need a canonical
> ordering of operations
> [00:35] <josephg> sharejs and wave both use the server to pick the order of
> operations (based on which order they reach the server)
> [00:35] <josephg> and then they use incrementing version numbers based on
> that order
> [00:35] <alown> ah yep.
> [00:35] <josephg> -> for p2p, that doesn't work because you don't have a
> centralized server, and anyone can send messages to anyone
> [00:36] <josephg> and yeah, you need TP2 for that (which sort of says you
> can apply ops from 3 different sites in any order and it still works)
> [00:37] <josephg> - and apparently someone proved that if you make it work
> for 3 sites, it works for any number of sites
> [00:43] <alown> Anyhow, I can see leaving inversion out for simplicity, but
> don't yet understand why it can't be made to work with TP2.
> [00:59] <alown> Hmm. Seen 'A Sequence Transformation Algorithm for
> Supporting Cooperative work on Mobile Devices'?
> [01:02] <josephg>
>
> http://research.microsoft.com/en-us/um/redmond/groups/connect/cscw_10/docs/p159.pdf
> ?
> [01:15] <alown> The main feature is its use of storing local/remote
> operations and processing them much later than receipt time.
> [01:17] <alown> ABT satisfies TP1+2, so looks like this should(?)
> [01:19] <josephg> need to read it
> [01:19] <josephg> ... I'll go through it later
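
The keyframe idea from the playback part of this discussion (keep snapshots, replay ops forward from the nearest one) could be sketched as below. The op format, `apply_op` and `document_at` are assumptions made for illustration; real wave ops are far richer than plain string inserts and deletes.

```python
def apply_op(doc, op):
    """Apply one toy op: ("insert", pos, text) or ("delete", pos, length)."""
    kind, pos, arg = op
    if kind == "insert":
        return doc[:pos] + arg + doc[pos:]
    if kind == "delete":
        return doc[:pos] + doc[pos + arg:]
    raise ValueError(kind)


def document_at(version, snapshots, ops):
    """Rebuild the document at `version` from the nearest earlier snapshot.

    snapshots maps version -> document text; ops[i] takes version i to i + 1.
    """
    base = max(v for v in snapshots if v <= version)
    doc = snapshots[base]
    for op in ops[base:version]:
        doc = apply_op(doc, op)
    return doc


ops = [
    ("insert", 0, "hello"),
    ("insert", 5, " world"),
    ("delete", 0, 6),
    ("insert", 5, "!"),
]
snapshots = {0: "", 2: "hello world"}   # keyframes at versions 0 and 2
```

Stepping back from version 4 to version 3 only replays one op on top of the version 2 keyframe, so no inverse ops are needed.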
>
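
The delete-then-undo problem josephg describes in section 5 can be reproduced with a toy position transform. This is a sketch under assumptions: `transform_insert` is a hypothetical helper for plain-text positions only, and the tie-break (an insert at the same position does not shift ours) is an arbitrary choice; with the opposite tie-break the text would land at the end instead, which is equally far from the intended spot.

```python
def transform_insert(pos, op):
    """Shift an insert position so it still applies after `op` has run."""
    kind, at, arg = op
    if kind == "delete":            # arg is the number of characters removed
        if pos <= at:
            return pos
        return max(at, pos - arg)
    if kind == "insert":            # arg is the inserted text
        return pos + len(arg) if at < pos else pos
    raise ValueError(kind)


doc = "abcdef"
delete_all = ("delete", 0, len(doc))
undo = ("insert", 0, doc)           # naive inverse: re-insert the deleted text

pos = 3                             # concurrent insert aimed at the middle
after_delete = transform_insert(pos, delete_all)
after_undo = transform_insert(after_delete, undo)
```

Here `after_delete` and `after_undo` are both 0: once the content has disappeared, the original offset of 3 is lost and re-inserting the text does not bring it back. Tombstones with a 'resurrect' operation avoid this because the deleted characters keep their identity.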
>
> *6) Usability of a pure p2p system in Real Life (tm):*
> [12:13] <alown> We also don't know if storing ops in a DHT is efficient
> enough for our use case...
> [12:14] <stenyak> in any case, let's say i fire up my wavep2p android
> client and want to check for any new waves
> [12:14] <stenyak> i definitely won't put up with a wait of 30 seconds when
> i have "this damn fast 4g connection!" in my cellphone
> [12:14] <stenyak> i mean, that's the point of view of six pack joe
> [12:14] <stenyak> and joe is definitely right..
> [12:15] * alown thinks of the hours it took to download the bitcoin
> blockchain from the p2p system
> [12:15] <stenyak> or browse through freenet, or whatever... it's painfully slow
> [12:16] <stenyak> in the end, i think that most users won't be running a
> full blown peer, but will be relying on an external server instead
> [12:16] <stenyak> i.e. nobody runs their own email servers nowadays
> [12:16] <stenyak> and the same can happen with wave
> [12:16] <alown> Should a mobile client be doing the full p2p federation, or
> simply talking to a server which does it...
> [12:16] <stenyak> the few who decide to run a full-blown wave peer, should
> be aware of the problems
> [12:17] <alown> So, this should be less of a problem since the only nodes
> doing p2p will be proper full-time connected servers?
> [12:17] <stenyak> the thing is, we can assume most people wont fire up
> their own xmpp server, but go for jabber.org account
> [12:17] <stenyak> and the same thing will presumably happen for wave,
> simply because it's easier to do
> [12:18] <stenyak> which doesn't prevent me from running my own full-blown
> wave server
> [12:18] <stenyak> but that's a use case in which the user knows the
> limitations
> [12:19] <stenyak> [...] you and i will run several full-blown wave peers at
> home, at our parent's house, or whatever, but we'll know and accept the
> problems
> [12:19] <stenyak> i think that's the way to think about the problem
> [12:19] <stenyak> heck, most people use github for permanent [git]
> connectivity ;-)
> [12:19] <stenyak> instead of opening ports to their laptop in their lan
> [12:19] <stenyak> and those are the tech-savvy people...
> [12:20] <alown> So, we have a p2p system between wave servers and superwave
> servers, with clients connecting to the server rather than doing the p2p
> itself...
> [12:20] <stenyak> i'm not saying it's the way we should do it. i'm saying
> that's the way it most probably will pan out, because it's already
> happening in 100% of the existing p2p protocols i know of
> [12:20] <alown> Hmm...
> [12:21] <stenyak> so we should plan for that instead of a theoretical pure
> p2p world
> [12:21] <stenyak> if we assume there's servers like github, bitbucket and
> sourceforge, then suddenly most of the problems go away, while still not
> preventing people from running fully p2p if they want
>
>
> *7) Comparison with BitTorrent and P2P-TV technologies:*
> [12:21] <alown> BT doesn't have huge servers (and with magnet has actually
> moved in the opposite direction).
> [12:21] <stenyak> BT has no real-time needs
> [12:22] <stenyak> that's why they can afford DHT
> [12:22] <stenyak> dht could be used for simulating a forum-like discussion
> in wave. but we can't force that restriction from the server
> [12:22] <stenyak> (i say forum-like, because people don't expect reaction
> within seconds there)
> [12:23] <alown> How did iplayer do its live p2p broadcasting?
> [12:23] * stenyak googles what iplayer is
> [12:23] <alown> Sorry, BBC iPlayer is their TV-over-the-internet system.
> [12:24] <alown> Originally it used a p2p system, but got lots of negative
> press (because of association with BT since it used p2p), so it now uses a
> centralized system instead. (And their bandwidth costs are much higher).
> [...]
> [12:25] <stenyak> i seem to recall other [p2p] tv clients
> [12:25] <stenyak>
> http://wiki.xbmc.org/index.php?title=HOW-TO:Play_free_P2P_(peer-to-peer)_online_streaming_TV
> [...]
> [12:26] <alown> Found a paper titled "RT-P2P: A Scalable Real-Time
> Peer-to-Peer System with Probabilistic Timing Assurances" (google for it)
> [12:28] <alown> Look at the paper I mentioned. It relies on 'super nodes'
> to enable it to keep low latencies...
> [...]
> [12:27] <stenyak> but i'd be wary of using this (p2p tv) as an inspiration.
> i know there's delay of 10-30 seconds from my TV Formula1 image to the
> telemetry that comes through HTTP from formula1.com website. this is
> regular TV, and they don't care about 30 seconds of lag
> [12:27] <stenyak> the only real problem of p2p tv is avoiding much jitter
> [12:27] <stenyak> as long as the stream arrives and is viewable, a delay of
> a minute doesn't matter that much
> [12:28] <alown> True.
>
>
> *8) Identifying participants (part 1):*
> [12:09] <alown> I am also no longer sure what an 'account' should look
> like, since it has no reason to be stuck to a domain...
> [12:10] <stenyak> current wave discovery works by using the domain name of
> the email-address-like list of participants
> [12:10] <stenyak> but here we're talking about hashes, public keys or
> whatever
> [12:10] <stenyak> which do not (necessarily) point to a particular IP:PORT
> or whatever
> [12:10] <alown> Exactly the problem...
> *...8) Identifying participants (part 2):*
> [12:33] <stenyak> would it make sense that, while some participants are
> identified by a pubkey (or whatever), many of them could be identified by a
> user@domain address, with which any peer can quickly locate supernodes?
> [12:33] <stenyak> i mean some kind of dual "pubkey and optional domain
> email-like addr" for the participants list
> [12:34] <stenyak> the optional part being essential in the broader internet
> [12:34] <alown> Isn't that exactly what using Mozilla Persona would do (map
> user@domain to some public-key we can use)
> [12:34] <alown> Removing the need for us to have to roll yet-another
> authentication system.
> [...]
> [12:38] <stenyak> the idea would be that, for a person to be a participant
> in a wave, you *require* his pubkey. optionally, you may have acquired this
> pubkey by asking "wave.google.com" about the user "joe", getting his
> pubkey as a result.
> [12:39] <stenyak> and now that you have the pubkey and one of many possible
> email-like addresses (in this case joe@wave.google.com), then you can use
> the email-like address for displaying in the UI
> [12:39] <stenyak> this means that, whoever wants to run pure p2p peers,
> will have to give his pubkey
> [12:39] <stenyak> and whoever uses the more traditional style, can simply
> give his email-like addr
> [12:39] <stenyak> and the participants list will show a simple email-like
> address most of the time
> [12:40] <alown> Do we then allow anyone to 'log in' to any wave server
> running at any domain, since it should no-longer make any difference where
> they are in the network...
> [12:41] <stenyak> yes, that's needed for world-wide-public waves, which is
> equivalent to a read-only forum on the net
> [12:41] <stenyak> then there could be server-public waves, which is
> equivalent to requiring sign-in to view a forum (and coincidentally the
> current implementation of public waves in WiaB, right?)
> [12:43] * alown has never tested what happens with public waves in the
> current federation system
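The dual identity discussed in part 2 (a required pubkey plus an optional email-like address) can be sketched as a small record type. This is only an illustration: the class name, field names, the hex-string key encoding and the "key:" display prefix are all assumptions, not anything agreed in the discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Participant:
    """A wave participant: the public key is required, the address is optional."""

    pubkey: str                    # hypothetical encoding, e.g. a hex string
    address: Optional[str] = None  # email-like "user@domain", if known

    def display_name(self) -> str:
        # Prefer the friendly address in the UI; otherwise show a short key prefix.
        return self.address if self.address else "key:" + self.pubkey[:8]

    def supernode_domain(self) -> Optional[str]:
        # The domain part hints at where a well-connected server may be found.
        if self.address and "@" in self.address:
            return self.address.rsplit("@", 1)[1]
        return None
```

A pure-P2P peer would be listed with only its pubkey, while a participant with a traditional account would carry both fields, so the participants list can show the email-like address most of the time.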
> *...8) Identifying participants (part 3):*
> [21:35] <josephg> - Who is a user? If a user is stenyak@example.com, then
> we can put a server at example.com and it can hold operations for you
> [21:36] <josephg> ie, if I add you to a wave, my computer (or my wave
> server or something) can send a message to example.com to say "Yo, here's
> some ops you should know about"
> [21:36] <josephg> that would be similar to a mailbox
> [21:37] <josephg> ... and it would work pretty well. Bear in mind that
> there's no reason operations have to go through the wave server at
> example.com - if we're both on a LAN together, we could discover one
> another through DNS service discovery and send ops directly
> [21:37] <josephg> .. without going through our respective wave servers
> [21:38] <josephg> However - if our identities aren't tied to a domain (eg
> bitcoin), then we'll need to use a dht or something.
> [21:42] <stenyak> the conclusion i've arrived at is that "users"
> ultimately are a publickey (for which they have the privatekey). this is
> inconvenient for people to "add you to a wave", so a possibility would be
> to have a friendlyname=>pubkey server converter. this way people can add
> "stenyak@example.com", by first finding out what the pubkey for
> stenyak@example.com really is
> [21:43] <stenyak> the friendlyname would be optional, and in LAN
> environments you could directly use the pubkey (instead of the friendly
> name)
> [21:43] <josephg> I think people will be more than happy to use a friendly
> name in a lan environment too
> [21:43] <stenyak> discovery in a local network could be done with bonjour
> or something too (not just dns)
> [21:44] <josephg> I <3 dns-sd
> [21:44] <stenyak> [...] maybe they already have a contact list (read, list
> of friendlyname<>pubkey equivalences) they can use in the UI (even if the
> underlying system will use pubkeys anyway)
> [21:44] <stenyak> and by contact list, i really mean a cache of some sort
> [21:45] <stenyak> (not some specific, complex roster system)
> [21:45] <josephg> and you can do friendlyname -> pubkey really easily by
> just storing the pubkey on the user's domain
> [21:45] <josephg> so, have the example.com webserver host
> https://example.com/.wellknown/stenyak
> [21:46] <josephg> = your public key.
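The friendlyname -> pubkey lookup, together with the "cache of some sort" stenyak mentions, could look roughly like the sketch below. The URL layout follows josephg's example as written; the response format (the key as plain text), the class and function names, and the injectable fetch function are assumptions made for illustration.

```python
from typing import Callable, Dict
from urllib.request import urlopen


def wellknown_url(address: str) -> str:
    """Build the lookup URL from the discussion for an address like user@domain."""
    user, domain = address.rsplit("@", 1)
    return f"https://{domain}/.wellknown/{user}"


def http_fetch(url: str) -> str:
    # Real network fetch; assumes the server answers with the key as plain text.
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8").strip()


class PubkeyResolver:
    """Resolve friendly names to public keys, caching every key already seen."""

    def __init__(self, fetch: Callable[[str], str] = http_fetch) -> None:
        self._fetch = fetch
        self._cache: Dict[str, str] = {}

    def resolve(self, address: str) -> str:
        # Only contact the user's domain the first time an address is seen.
        if address not in self._cache:
            self._cache[address] = self._fetch(wellknown_url(address))
        return self._cache[address]
```

Calling `PubkeyResolver().resolve("stenyak@example.com")` would fetch the key once and answer later lookups from the cache. A real client would also need some way to notice a changed key, which this sketch does not attempt.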
>
>
> *9) P2P anonymity (peers that want to anonymously lurk in a wave) (part
> 1):*
> [12:48] <stenyak> by the way, what about non-participants that simply want
> to lurk a wave?
> [12:49] <stenyak> e.g. i'm given a wave uri
> (wave://look_at_these_kittens_wave), and want to view it
> [12:49] <alown> Whilst a wave is public, as soon as they 'read' the wave,
> they would have a metadata wavelet created, so would become a participant
> (if read-only).
> [12:50] <stenyak> and from then on, whenever the wave changes, someone will
> try to make the change reach the peers with my privkey
> [12:50] <stenyak> supposedly..
> *...9) P2P anonymity (peers that want to anonymously lurk in a wave) (part
> 2):*
> [21:18] <josephg> stenyak: interesting point about people who want to not
> participate but follow a wave anyway - it's really bad if other people can
> tell that they're there (assuming the wave is public).
> [21:18] <josephg> I guess we just need to make sure that the metadata wave
> is invisible, and then it's ok..
> [21:21] <stenyak> invisible.. to what peer/s? surely those that are
> transmitting deltas to the lurkers will need to know they exist?
> [21:21] <stenyak> (maybe some of the algorithms behind freenet can help
> with this)
> [21:21] <stenyak> (or even TOR)
>
>
> *10) Encryption of waves:*
> [21:47] <josephg> for waves themselves, I'm imagining giving each wave an
> AES key
> [21:47] <josephg> then storing an encrypted version of the key for each
> participant on the wave
> [21:48] <josephg> .... anyway, that way anyone who has the AES key can read
> all ops on the wave
> [21:48] <josephg> and can participate (because they can encrypt ops for the
> wave)
>
>
> *11) Addition and removal of participants, and their ability to read past
> and future wave versions/deltas:*
> [21:48] <stenyak> what about removing a user from a wave?
> [21:49] <josephg> worst case, we can just make a new key and re-add
> everyone using the new key
> [21:49] <josephg> and keep around the old key too
> [21:49] <josephg> so people can still read the old ops as well
> [21:49] <stenyak> the user can access their browser cache for all we care.
> if you ever read it, there will be ways to do it. "download now wave-spy to
> read waves you were removed from!"
> [21:49] <stenyak> so providing an official way sounds better
> [21:50] <stenyak> the AES key could change at any point in time, e.g.
> whenever a new user is added (to prevent them accessing the history), or
> whenever one is removed (to prevent them from reading future history)
> [22:32] <josephg> um - in wave, we let new users see the whole history
> [22:40] <stenyak> but that use case could be desirable, right? and if we
> support modification/versioning of the AES key, we might as well allow that
> too? the equivalent in email world would be to forward an email, removing
> the existing quotes
> [23:17] <josephg> Yep definitely.
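The key versioning discussed in sections 10 and 11 can be modelled without any real cryptography, just by tracking which participants hold which version ("epoch") of the wave key. The sketch below makes several assumptions: the class and method names are hypothetical, keys are plain random 32-byte values standing in for AES keys, and the step of encrypting each key for a participant's pubkey is left out entirely.

```python
import secrets
from typing import Dict, List, Set


class WaveKeyRing:
    """Track which participants hold which versions (epochs) of a wave's key."""

    def __init__(self, participants: List[str]) -> None:
        self.keys: List[bytes] = [secrets.token_bytes(32)]  # epoch 0
        self.holders: Dict[str, Set[int]] = {p: {0} for p in participants}
        self.active: Set[str] = set(participants)

    @property
    def epoch(self) -> int:
        return len(self.keys) - 1

    def _rotate(self) -> None:
        # New key, handed only to the participants who are still on the wave.
        self.keys.append(secrets.token_bytes(32))
        for p in self.active:
            self.holders[p].add(self.epoch)

    def add(self, participant: str, share_history: bool = True) -> None:
        self.active.add(participant)
        self.holders.setdefault(participant, set())
        if share_history:
            # Wave's current behaviour: newcomers can read everything so far.
            self.holders[participant].update(range(self.epoch + 1))
        else:
            # Rotate so the newcomer only holds the key for ops from now on.
            self._rotate()

    def remove(self, participant: str) -> None:
        self.active.discard(participant)
        self._rotate()  # the removed user keeps old keys but never gets the new one

    def can_read(self, participant: str, epoch: int) -> bool:
        return epoch in self.holders.get(participant, set())
```

Old keys are kept around, as josephg suggests, so remaining participants can still read earlier ops. A removed participant keeps whatever epochs they already held, which matches stenyak's point that anything once read cannot really be taken back.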
>
>
> --
> Saludos,
>      Bruno González
>
> _______________________________________________
> Jabber: stenyak AT gmail.com
> http://www.stenyak.com
>
