cassandra-user mailing list archives

From Yang <>
Subject Re: frequent node UP/Down?
Date Sun, 25 Sep 2011 18:10:07 GMT
Thanks Brandon.

I'll try this.

But you can also see my later post regarding message drops:

That seems to show that something in either the code or the background
load is causing messages to actually be dropped.


On Sun, Sep 25, 2011 at 10:59 AM, Brandon Williams <> wrote:
> On Sun, Sep 25, 2011 at 12:52 PM, Yang <> wrote:
>> Thanks Brandon.
>> I suspected that, but I think that's precluded as a possibility since
>> I set up another background job to do
>> echo | nc other_box 7000
>> in a loop,
>> this job seems to be working fine all the time, so network seems fine.
> This isn't measuring latency, however.  That is how the failure
> detector works, using probability to estimate the likelihood that a
> given host is alive, based on previous history.  The situation on ec2
> is something like the following: 99% of pings are 1ms, but sometimes
> there are brief periods of 100ms, and this is where the FD says "this
> is not realistic, I think the host is dead" but then receives the
> ping, and thus the flapping.  I've seen it a million times, increasing
> the phi threshold always solves it.
> -Brandon
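
To make the failure detector behaviour described above concrete, here is a minimal sketch of a phi-style estimate. It assumes heartbeat inter-arrival times follow an exponential distribution, which is a simplification and not necessarily what Cassandra's own code does. The class name PhiEstimator, its methods, and the example thresholds of 8 and 12 are hypothetical and chosen only for illustration.

```python
import math
from collections import deque


class PhiEstimator:
    """Toy phi estimate from heartbeat history (illustrative, not Cassandra's code)."""

    def __init__(self, window=1000):
        # Keep only the most recent inter-arrival intervals.
        self.intervals = deque(maxlen=window)
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        """Suspicion level at time `now`.

        Under the exponential assumption, the probability of a gap at least
        this long is exp(-elapsed / mean), and phi = -log10(probability).
        """
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_arrival
        return elapsed / (mean * math.log(10))


# Heartbeats arrive once per second for 100 seconds, then stop.
est = PhiEstimator()
for t in range(101):
    est.heartbeat(float(t))

# 20 seconds of silence after the last heartbeat at t=100.
phi_value = est.phi(120.0)
for threshold in (8, 12):
    verdict = "down" if phi_value > threshold else "up"
    print(f"threshold={threshold}: phi={phi_value:.2f} -> {verdict}")
```

In this toy model the same pause marks the host down at the lower threshold and leaves it up at the higher one, which is the trade-off behind raising the phi threshold: fewer false convictions in exchange for slower detection of a host that really is dead.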
