We have observed this, but in practice it doesn't cause any deleterious effects. IMHO, falsely detecting node failures is the most dangerous thing that could result from this kind of behavior, and that is why we have an Accrual FD, which reacts and adjusts to these conditions. Having said that, moving to TCP is not a bad option at all at relatively small scale.
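
To make the "reacts and adjusts" part concrete, here is a rough sketch of how an accrual failure detector can work. This is not Cassandra's actual code: the class name, the window size, the threshold of 8, and the assumption that heartbeat gaps are roughly exponentially distributed are all illustrative. Instead of a fixed timeout, it turns "time since the last heartbeat" into a suspicion level (phi) scaled by the recently observed mean gap, so a few dropped or delayed packets raise suspicion gradually rather than flagging the node as dead right away.

```python
import math
from collections import deque


class AccrualFailureDetector:
    """Toy accrual failure detector (illustrative, not Cassandra's implementation)."""

    def __init__(self, window_size=1000, threshold=8.0):
        # Sliding window of recent gaps between heartbeats, in seconds.
        self.intervals = deque(maxlen=window_size)
        # Suspicion level above which the node is treated as down (assumed value).
        self.threshold = threshold
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Suspicion level: -log10 of P(gap > elapsed) under an exponential model."""
        if not self.intervals:
            return 0.0  # not enough history to suspect anything yet
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        # P(gap > elapsed) = exp(-elapsed / mean), so phi = elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))

    def is_suspect(self, now):
        return self.phi(now) > self.threshold


# Heartbeats every second: a 20 s silence pushes phi past the threshold.
fast = AccrualFailureDetector()
for t in range(11):
    fast.heartbeat(float(t))

# Heartbeats every 3 seconds (e.g. a congested link): the same 20 s silence
# is less surprising, so phi stays below the threshold.
slow = AccrualFailureDetector()
for t in range(0, 31, 3):
    slow.heartbeat(float(t))
```

The point of the two instances is that the same silence is judged against each node's own recent history, which is what lets the detector tolerate a lossy transport without a hand-tuned timeout.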
Hey guys! I have a simple question. I'm a casual observer, not a real Cassandra user yet. So, excuse my ignorance.
I see that the Gossip feature uses UDP. I was curious to know whether you guys have faced issues with unreliable transports in your production clusters, like faulty switches, dropped packets, etc. during heavy network loads?
Also, if I'm not mistaken, all client reads/writes go point-to-point over TCP, right?