cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-6385) FD phi estimator initial conditions
Date Wed, 20 Nov 2013 17:50:35 GMT


Jonathan Ellis updated CASSANDRA-6385:

    Attachment: 6385-v2.txt

Thinking about it more, I'm not comfortable with saying that a dead node will *never* be detected
if it dies before its arrival window reaches the cutoff.  v2 instead uses sqrt(phi) until the
window reaches 30 samples.
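As a rough illustration of the sqrt(phi) idea (a minimal sketch, not the contents of 6385-v2.txt; the class name, method name, and MIN_SAMPLES constant are hypothetical), the damping could look like this:

```java
// Hypothetical sketch: damp phi while the arrival window is still small.
// None of these names come from the Cassandra code base or from the patch.
final class DampedPhi {
    // Assumed cutoff, taken from the "30" mentioned in the comment above.
    static final int MIN_SAMPLES = 30;

    // Returns sqrt(rawPhi) until sampleCount reaches MIN_SAMPLES,
    // and the unmodified rawPhi afterwards.
    static double dampedPhi(double rawPhi, int sampleCount) {
        if (sampleCount >= MIN_SAMPLES) {
            return rawPhi;
        }
        return Math.sqrt(rawPhi);
    }
}
```

Assuming a conviction threshold of 8 (the default phi_convict_threshold), raw phi would have to exceed 64 before a node with fewer than 30 samples is convicted, so a node that dies early is detected later rather than never. Note that the square root raises values below 1, which is harmless here because such values are far below any conviction threshold.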

We could probably do something more sophisticated that narrows the fudge factor as we approach
our threshold of confidence.

(Both of these break ArrivalWindowTest, btw.)

> FD phi estimator initial conditions
> -----------------------------------
>                 Key: CASSANDRA-6385
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Quentin Conner
>         Attachments: 6385-v2.txt, 6385.txt
> phi estimates are calculated for newly discovered nodes from an un-filled (new, uninitialized)
> The inter-arrival time (elapsed time between gossip heartbeats) is stored in the o.a.c.gms.ArrivalWindow.arrivalIntervals
> deque for each received heartbeat, up to the maximum window size of 1000 samples.
> In the o.a.c.gms.FailureDetector.interpret() method, phi is calculated for the node using
> a statistical measure called variance.  Like the mean, the variance of a population (a set of
> numbers or measurements) is not statistically relevant unless the population size is 30
> or greater.
> When a new node is discovered, the calculated variance is higher than normal, and causes
> phi to be higher than normal, resulting in a false positive failure detection.
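To make the description above concrete, here is a minimal sketch of a bounded window of inter-arrival times with mean and population variance. It is an assumption-laden illustration, not the o.a.c.gms.ArrivalWindow implementation: the class name, method names, and the use of ArrayDeque are all hypothetical.

```java
import java.util.ArrayDeque;

// Hypothetical sketch of a bounded inter-arrival window; not Cassandra's code.
final class ArrivalWindowSketch {
    private final int maxSize;                        // e.g. 1000 samples
    private final ArrayDeque<Double> intervals = new ArrayDeque<>();
    private double lastArrival = Double.NaN;          // NaN until the first heartbeat

    ArrivalWindowSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    // Record a heartbeat; the first one only sets the baseline timestamp.
    void add(double arrivalTimeMillis) {
        if (!Double.isNaN(lastArrival)) {
            if (intervals.size() == maxSize) {
                intervals.removeFirst();              // evict the oldest interval
            }
            intervals.addLast(arrivalTimeMillis - lastArrival);
        }
        lastArrival = arrivalTimeMillis;
    }

    int size() {
        return intervals.size();
    }

    double mean() {
        if (intervals.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (double d : intervals) {
            sum += d;
        }
        return sum / intervals.size();
    }

    // Population variance of the stored intervals.
    double variance() {
        if (intervals.isEmpty()) {
            return 0.0;
        }
        double m = mean();
        double sumSquares = 0.0;
        for (double d : intervals) {
            sumSquares += (d - m) * (d - m);
        }
        return sumSquares / intervals.size();
    }
}
```

With only a handful of samples, a single delayed heartbeat dominates both statistics, which is the initial-condition problem the report describes; size() gives a caller what it needs to apply a cutoff such as 30 samples.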

This message was sent by Atlassian JIRA
