cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-6385) FD phi estimator initial conditions
Date Wed, 20 Nov 2013 19:00:35 GMT


Jonathan Ellis updated CASSANDRA-6385:

    Attachment: 6385-v3.txt

Thinking about it more, I think the main problem is that the initial value used to seed
the Window is too low.  Interval / 2 is always smaller than the actual mean will be, and it
falls further below the actual mean as the cluster size grows.
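
As a rough illustration of why an underestimated mean matters, the sketch below computes phi under a simple exponential inter-arrival model, where phi depends only on the elapsed time and the estimated mean interval. This model is an assumption made for illustration and is not taken from the patch; the function name `phi` and the millisecond values are hypothetical.

```python
import math

def phi(elapsed_ms, mean_interval_ms):
    """Suspicion level under an assumed exponential inter-arrival model.

    P(next heartbeat arrives later than elapsed) = exp(-elapsed / mean),
    and phi = -log10(P) = elapsed / (mean * ln 10).
    """
    return elapsed_ms / (mean_interval_ms * math.log(10))

# Same silence (2000 ms), two different estimates of the mean interval.
phi_with_low_mean = phi(2000, 500)    # mean underestimated by the seed
phi_with_true_mean = phi(2000, 1000)  # mean closer to the real interval
```

Under this model, halving the estimated mean doubles phi for the same silence, so a seed that sits below the real mean pushes a new node toward the conviction threshold sooner.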

Picking a nice large value there gives us the behavior that we want: a large fudge to start
that "decays" (by being averaged with real values) as we get more data.
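
The "decay" can be sketched as a bounded window that starts with a single seed value and then averages in real samples. This is a minimal sketch and not the code in the attached patch; the class name `ArrivalWindowSketch`, the helper `mean_after`, and the seed and interval values are hypothetical.

```python
from collections import deque

class ArrivalWindowSketch:
    """Bounded window of inter-arrival intervals, pre-seeded with one value."""

    def __init__(self, max_size, seed_interval_ms):
        # deque(maxlen=...) evicts the oldest entry once the window is full.
        self.intervals = deque([seed_interval_ms], maxlen=max_size)

    def add(self, interval_ms):
        self.intervals.append(interval_ms)

    def mean(self):
        return sum(self.intervals) / len(self.intervals)

def mean_after(seed_ms, real_interval_ms, n_samples, max_size=1000):
    """Mean of the window after n_samples identical real intervals arrive."""
    window = ArrivalWindowSketch(max_size, seed_ms)
    for _ in range(n_samples):
        window.add(real_interval_ms)
    return window.mean()

# A large seed (4000 ms, hypothetical) against real intervals of 1000 ms.
start = mean_after(4000, 1000, 0)       # only the seed is present
early = mean_after(4000, 1000, 3)       # seed averaged with 3 real samples
settled = mean_after(4000, 1000, 1000)  # seed has been evicted from the window
```

With these assumed numbers the mean starts at the seed, drops toward the real interval as samples arrive, and matches the real interval once the seed has been pushed out of the window.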

v3 attached.

> FD phi estimator initial conditions
> -----------------------------------
>                 Key: CASSANDRA-6385
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Quentin Conner
>             Fix For: 1.2.13, 2.0.3
>         Attachments: 6385-v2.txt, 6385-v3.txt, 6385.txt
> phi estimates are calculated for newly discovered nodes from an un-filled (new, uninitialized)
> The inter-arrival time (elapsed time between gossip heartbeats) is stored in the o.a.c.gms.ArrivalWindow.arrivalIntervals
deque for each received heartbeat, up to the maximum window size of 1000 samples.
> In the o.a.c.gms.FailureDetector.interpret() method, phi is calculated for the node using
a statistical measure called variance.  Like the mean, the variance of a population (a set of
numbers or measurements) is not statistically relevant unless the population size is 30
or greater.
> When a new node is discovered, the calculated variance is higher than normal, and causes
phi to be higher than normal, resulting in a false positive failure detection.

This message was sent by Atlassian JIRA
