zookeeper-user mailing list archives

From Greg Asta <greg.a...@omnigon.com>
Subject RE: Why ZK uses TCP instead of UDP (IP multicast)?
Date Tue, 30 Dec 2014 16:30:39 GMT
I'll toss out here that multicast over IPsec tends not to work as well. I haven't found any
way to get multicast to work across multiple data centers (granted, I haven't tried especially
hard either).

-----Original Message-----
From: Dave Katz [mailto:dkatz@dkatz.org] 
Sent: Friday, December 26, 2014 3:26 PM
To: user@zookeeper.apache.org
Cc: zookeeper-user@hadoop.apache.org
Subject: Re: Why ZK uses TCP instead of UDP (IP multicast)?

On Dec 25, 2014, at 4:49 PM, Ibrahim <i.s.el-sanosi@newcastle.ac.uk> wrote:

> Yes, you are right when you say "Reliable UDP isn't a defined standard".
> However, some protocols have been implemented on top of UDP. For
> example, some Red Hat apps, protocols, and frameworks are implemented using
> reliable UDP (their own reliable UDP), and they work fast and reliably.
> To answer your question, "what sort of improvements would you expect
> from it, over TCP?":
> ZooKeeper uses TCP to send and receive messages. For example,
> when the leader sends a proposal to 4 followers, it sends it one by one,
> meaning that it needs to send 4 messages (four outgoing packets).
> Whereas, to achieve the same thing using RUDP (IP multicast), the leader
> only needs to send one message (one outgoing packet); as a result,
> it reduces the network traffic.

There's no free lunch with multicast; making it reliable is either expensive (unicast acknowledgement
back to the source, which doesn't scale) or complex (ack aggregation, NACK schemes). It also
makes flow control difficult and complex.
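
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. It is a toy model, not anything taken from ZooKeeper: the function name leader_packet_counts and the lost parameter are made up for illustration. It assumes one packet per proposal, that lost followers are repaired by a single retransmission that succeeds, and it ignores ack aggregation entirely.

```python
def leader_packet_counts(followers: int, lost: int = 0) -> dict:
    """Toy model: (packets sent, packets received) at the leader for one proposal.

    Hypothetical helper for illustration only. `lost` is the number of
    followers that miss the first transmission.
    """
    if not 0 <= lost <= followers:
        raise ValueError("lost must be between 0 and followers")
    # One repair multicast is assumed to reach every follower that missed out.
    retransmit = 1 if lost else 0
    return {
        # Unicast (TCP-like): one copy per follower, resend per loss, one ACK each.
        "unicast": (followers + lost, followers),
        # Multicast + unicast ACKs: few sends, but every follower still ACKs the source.
        "multicast_ack": (1 + retransmit, followers),
        # Multicast + NACKs: only followers that noticed a gap talk to the source.
        "multicast_nack": (1 + retransmit, lost),
    }


if __name__ == "__main__":
    for n, lost in [(4, 0), (1000, 10)]:
        print(n, lost, leader_packet_counts(n, lost))
```

With 4 followers, multicast saves three outgoing packets, but in the ACK variant the leader still has to process four acknowledgements. At 1000 followers the incoming ACKs dominate, which is the "doesn't scale" part. The NACK variant looks cheap in this model only because the hard parts (gap detection, timers, deciding when the sender may discard its buffer) are assumed away.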

The fact that TCP has been stable for 30+ years, but reliable multicast is still in the realm
of experimentation, is a sign that it's not as easy as it seems.
