cassandra-commits mailing list archives

From "Joel Knighton (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-10231) Null status entries on nodes that crash during decommission of a different node
Date Fri, 09 Oct 2015 20:17:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951117#comment-14951117 ]

Joel Knighton edited comment on CASSANDRA-10231 at 10/9/15 8:16 PM:
--------------------------------------------------------------------

I think the force blocking flush approach is the least invasive and most likely to ensure
correctness.

With log entries, I've confirmed that the behavior I suspected occurs. Before commitlog replay,
we {{populateTokenMetadata}} for node1, node2, and node3. After commitlog replay, when we
{{populateTokenMetadata}} again, we only consider node2 and node3, so node1 stays present in the
{{tokenMetadata}}.
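
To illustrate the hazard, here is a minimal, hypothetical sketch (the method name mirrors the
real one, but the types and body are illustrative, not Cassandra's code):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class RepopulationSketch
{
    // Stands in for tokenMetadata: endpoint -> tokens.
    private final Map<String, Set<Long>> tokenMetadata = new HashMap<>();

    // Called once before commitlog replay (from the saved system tables)
    // and again after replay. It only ever *adds* the endpoints it reads
    // from the PEERS table...
    void populateTokenMetadata(Map<String, Set<Long>> peersTable)
    {
        tokenMetadata.putAll(peersTable);
        // ...so an endpoint added by the first pass (node1) is never
        // evicted when the second pass, which runs after the replayed
        // deletion, no longer sees it.
    }
}
{code}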

I pushed a branch [10231-alternate|https://github.com/jkni/cassandra/commits/10231-alternate]
with a {{forceBlockingFlush}} only in {{removeEndpoint}}. I'll create a follow-up ticket to
further discuss the use of {{forceBlockingFlush}} for the other {{PEERS}}-related methods in
{{SystemKeyspace}}.

With this change, the attached dtest passes.
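
For reference, a minimal sketch of the shape of the change (the stub helpers stand in for
Cassandra's internals; the added flush is the point):

{code:java}
import java.net.InetAddress;

public class RemoveEndpointSketch
{
    public static void removeEndpoint(InetAddress ep)
    {
        executeInternal("DELETE FROM system.peers WHERE peer = ?", ep);
        // Added by this patch: flush synchronously so the deletion is
        // durable in an sstable, not only in the memtable/commitlog,
        // before we return.
        forceBlockingFlush("peers");
    }

    // Stubs standing in for the real SystemKeyspace helpers.
    static void executeInternal(String cql, Object... args) {}
    static void forceBlockingFlush(String table) {}
}
{code}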

In CI, there are no unit test failures out of the ordinary.

In CI, there is only one dtest failure outside of historically flaky tests and tests with known
problems. This failure is in {{commitlog_test.TestCommitLog.stop_failure_policy_test}} and is
reproducible locally. In the original patch, upon commitlog failure, when gossip was shut down,
we would notify {{onChange}}, which in {{handleStateNormal}} would call {{updateTokens}} for the
local node, which in turn would call {{removeEndpoint}}, causing the thread to hang in
{{forceBlockingFlush}} (due to the aforementioned commitlog failure).
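
The hang itself is the usual blocking-on-a-dead-writer pattern; a self-contained illustration
(not Cassandra code; this program intentionally never terminates):

{code:java}
import java.util.concurrent.CompletableFuture;

public class BlockingFlushHang
{
    public static void main(String[] args)
    {
        // A blocking flush is ultimately a wait on a future completed by
        // the flush machinery. Once the commitlog has failed and writes
        // are stopped, nothing ever completes the future, so this join()
        // blocks forever: the hang observed in stop_failure_policy_test.
        CompletableFuture<Void> flush = new CompletableFuture<>();
        flush.join();
    }
}
{code}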

Looking at the git history, it seems this {{removeEndpoint}} call is precautionary: there is
currently no gossip transition that results in the local node being present in {{PEERS}}. As a
result, I've removed this call from {{updateTokens}}, and the above commitlog test now passes.
This commit has been pushed to the branch
[10231-alternate|https://github.com/jkni/cassandra/commits/10231-alternate]. The attached dtest
still passes, as expected.
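
A hypothetical before/after sketch of that change (the guard's exact shape is assumed from the
description above, not copied from the patch):

{code:java}
import java.net.InetAddress;
import java.util.Collection;

public class UpdateTokensSketch
{
    public static void updateTokens(InetAddress ep, Collection<String> tokens)
    {
        if (ep.equals(getBroadcastAddress()))
        {
            // Before: also called removeEndpoint(ep) here as a precaution,
            // which could hang in forceBlockingFlush after a commitlog
            // failure. Now we simply ignore the local node, since no
            // gossip transition currently puts it in PEERS.
            return;
        }
        // ... persist tokens for the remote peer ...
    }

    static InetAddress getBroadcastAddress() { return null; } // stub
}
{code}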

I'm waiting for CI to finish for this change; in the meantime, any feedback or review would
be great.



> Null status entries on nodes that crash during decommission of a different node
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10231
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10231
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Joel Knighton
>            Assignee: Joel Knighton
>             Fix For: 3.0.0 rc2
>
>         Attachments: n1.log, n2.log, n3.log, n4.log, n5.log
>
>
> This issue is reproducible through a Jepsen test of materialized views that crashes and decommissions nodes throughout the test.
> In a 5 node cluster, if a node crashes at a certain point (unknown) during the decommission of a different node, it may start with a null entry for the decommissioned node like so:
> DN 10.0.0.5 ? 256 ? null rack1
> This entry does not get updated/cleared by gossip. This entry is removed upon a restart of the affected node.
> This issue is further detailed in ticket [10068|https://issues.apache.org/jira/browse/CASSANDRA-10068].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
