cassandra-commits mailing list archives

From "Aleksey Yeschenko (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-10143) Apparent counter overcount during certain network partitions
Date Fri, 23 Oct 2015 15:52:28 GMT


Aleksey Yeschenko updated CASSANDRA-10143:
    Fix Version/s:     (was: 3.0.x)

> Apparent counter overcount during certain network partitions
> ------------------------------------------------------------
>                 Key: CASSANDRA-10143
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Joel Knighton
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.1.x, 2.2.x, 3.1
> This issue is reproducible in this [Jepsen Test|].
> The test starts a five-node cluster and issues increments by one against a single counter.
It then checks that the counter is in the range [OKed increments, OKed increments + Write
Timeouts] at each read. Increments are issued at CL.ONE and reads at CL.ALL.  Throughout the
test, network failures are induced that create halved network partitions. A halved network
partition splits the cluster into three connected nodes and two connected nodes, randomly.
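> As a rough illustration of the range check described above, here is a minimal Python sketch.
It is not the Jepsen checker itself. The function names and the (op, outcome, value) event format
are hypothetical, and it assumes a strictly sequential history, ignoring operations still in flight
at the time of a read.

```python
def counter_read_in_bounds(read_value, acked_increments, timed_out_increments):
    """Return True if a read lies in [acked, acked + timed out]."""
    lower = acked_increments                         # every OKed increment must be visible
    upper = acked_increments + timed_out_increments  # a timed-out write may or may not have applied
    return lower <= read_value <= upper


def first_violation(history):
    """Scan (op, outcome, value) events in order.

    Returns (index, read_value, (lower, upper)) for the first out-of-range
    read, or None if every read is within bounds.
    """
    acked = 0
    timed_out = 0
    for index, (op, outcome, value) in enumerate(history):
        if op == "inc":
            if outcome == "ok":
                acked += 1
            elif outcome == "timeout":
                timed_out += 1
        elif op == "read" and outcome == "ok":
            if not counter_read_in_bounds(value, acked, timed_out):
                return index, value, (acked, acked + timed_out)
    return None
```

> Under these assumptions, an overcount shows up as a read above acked + timed_out, which is
the kind of failure reported here.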
> This test started failing; bisects showed that it was actually a test change that caused
this failure. When the network partitions are induced in a cycle of 15s healthy/45s partitioned
or 20s healthy/45s partitioned, the test fails. When network partitions are induced in a
cycle of 15s healthy/60s partitioned, 20s healthy/45s partitioned, or 20s healthy/60s partitioned,
the test passes.
> There is nothing unusual in the logs of the nodes for the failed tests. The results are
very reproducible.
> One noticeable trend is that more reads seem to get serviced during the failed tests.
> Most testing has been done in 2.1.8 - the same issue appears to be present in 2.2/3.0/trunk,
but I haven't spent as much time reproducing.
> Ideas?

This message was sent by Atlassian JIRA
