incubator-cassandra-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: frequent node UP/Down?
Date Sun, 25 Sep 2011 18:10:07 GMT
Thanks Brandon.

I'll try this.

However, please also see my later post about dropped messages:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201109.mbox/%3CCAAnh3_8AeHidYH9ybt82_EMH3LikbCDseNRak3JHfzaJ2L+9zQ@mail.gmail.com%3E

That seems to show that something in either the code or the background
load is causing messages to actually be dropped.


Yang

On Sun, Sep 25, 2011 at 10:59 AM, Brandon Williams <driftx@gmail.com> wrote:
> On Sun, Sep 25, 2011 at 12:52 PM, Yang <teddyyyy123@gmail.com> wrote:
>> Thanks Brandon.
>>
>> I suspected that, but I think it's ruled out as a possibility, since
>> I set up another background job to do
>> echo | nc other_box 7000
>> in a loop.
>> This job seems to work fine all the time, so the network seems fine.
>
> That isn't measuring latency, however, and latency is what the failure
> detector works from: it uses probability to estimate the likelihood
> that a given host is alive, based on previous history.  The situation
> on ec2 is something like the following: 99% of pings are 1ms, but
> sometimes there are brief periods of 100ms, and this is where the FD
> says "this is not realistic, I think the host is dead" but then
> receives the ping, and thus the flapping.  I've seen it a million
> times; increasing the phi threshold always solves it.
>
> -Brandon
>
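
For anyone reading this thread later, here is a rough Python sketch of
the kind of calculation Brandon describes. It is not Cassandra's code:
the class name PhiSketch, the exponential model of arrival intervals,
and all of the numbers are assumptions made for illustration. It keeps
a history of heartbeat intervals and turns the current silence into a
phi value that grows the longer a heartbeat is overdue relative to that
history.

```python
import math
from collections import deque


class PhiSketch:
    """Toy phi-style suspicion estimator (hypothetical, not Cassandra's code)."""

    def __init__(self, window=1000):
        # Sliding window of recent inter-arrival times between heartbeats.
        self.intervals = deque(maxlen=window)
        self.last = None

    def heartbeat(self, now):
        # Record the gap since the previous heartbeat, if there was one.
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def phi(self, now):
        # Assumes exponentially distributed intervals, so that
        # phi = -log10(P(gap > elapsed)) = elapsed / (mean * ln 10).
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        return (now - self.last) / (mean * math.log(10))


def is_suspected(phi_value, threshold):
    # A host is suspected once phi exceeds the configured threshold.
    return phi_value > threshold


# Made-up history: 101 heartbeats arriving exactly one time unit apart.
fd = PhiSketch()
for t in range(101):
    fd.heartbeat(float(t))

# Then a silence of 20 time units before we ask how suspicious that is.
late = fd.phi(120.0)
suspected_at_8 = is_suspected(late, 8)
suspected_at_12 = is_suspected(late, 12)
```

With this made-up history the same silence crosses a threshold of 8 but
not a threshold of 12, which is the effect of raising the phi threshold:
a brief stall no longer marks the host down. In cassandra.yaml the
setting is, as far as I know, phi_convict_threshold.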
