Thanks Brandon.
I'll try this.
But see also my later post regarding dropped messages:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201109.mbox/%3CCAAnh3_8AeHidYH9ybt82_EMH3LikbCDseNRak3JHfzaJ2L+9zQ@mail.gmail.com%3E
That seems to show that something in either the code or the background
load is causing messages to actually be dropped.
Yang
On Sun, Sep 25, 2011 at 10:59 AM, Brandon Williams <driftx@gmail.com> wrote:
> On Sun, Sep 25, 2011 at 12:52 PM, Yang <teddyyyy123@gmail.com> wrote:
>> Thanks Brandon.
>>
>> I suspected that, but I think that's precluded as a possibility since
>> I setup another background job to do
>> echo | nc other_box 7000
>> in a loop,
>> this job seems to be working fine all the time, so network seems fine.
>
> This isn't measuring latency, however. That is how the failure
> detector works, using probability to estimate the likelihood that a
> given host is alive, based on previous history. The situation on ec2
> is something like the following: 99% of pings are 1ms, but sometimes
> there are brief periods of 100ms, and this is where the FD says "this
> is not realistic, I think the host is dead" but then receives the
> ping, and thus the flapping. I've seen it a million times, increasing
> the phi threshold always solves it.
>
> -Brandon
>
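
For reference, here is a rough sketch of the accrual idea Brandon describes
above: keep a history of heartbeat inter-arrival times and turn "how late is
the next one" into a suspicion value (phi) that is compared against a
threshold. This is not Cassandra's code. The class name, the window size, the
1000 ms heartbeat interval, and the exponential inter-arrival model are all
assumptions made for illustration, and the thresholds 8 and 12 are just
example values.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Illustrative phi accrual failure detector (hypothetical, simplified)."""

    def __init__(self, window=1000):
        # Sliding window of recent heartbeat inter-arrival times (ms).
        self.intervals = deque(maxlen=window)
        self.last_arrival = None

    def heartbeat(self, now_ms):
        # Record the gap since the previous heartbeat, if there was one.
        if self.last_arrival is not None:
            self.intervals.append(now_ms - self.last_arrival)
        self.last_arrival = now_ms

    def phi(self, now_ms):
        # Assumption: inter-arrival times are exponentially distributed, so
        # P(gap > t) = exp(-t / mean) and phi = -log10(P) = t / (mean * ln 10).
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now_ms - self.last_arrival
        return elapsed / (mean * math.log(10))


# Steady heartbeats every 1000 ms (assumed interval), then a 20 s silence.
detector = PhiAccrualDetector()
for i in range(101):
    detector.heartbeat(i * 1000)

suspicion = detector.phi(120_000)
print(suspicion > 8)   # a threshold of 8 would mark the host down
print(suspicion > 12)  # a threshold of 12 would still tolerate the pause
```

The point of the sketch is only that the same pause can be over the line at
one threshold and under it at a higher one, which is why raising the phi
threshold stops the flapping at the cost of slower detection of real failures.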