From: "T Jake Luciani (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Date: Wed, 30 Sep 2015 13:48:04 +0000 (UTC)
Subject: [jira] [Commented] (CASSANDRA-10413) Replaying materialized view updates from commitlog after node decommission crashes Cassandra

    [ https://issues.apache.org/jira/browse/CASSANDRA-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936847#comment-14936847 ]

T Jake Luciani commented on CASSANDRA-10413:
--------------------------------------------

bq. I'm not yet sure that selecting the local address as the view replica is always appropriate in this situation

I agree: if the node is not joined, we have no way of knowing whether the gossip state is stale, in which case we should always apply the batchlog. So rather than doing this check in getViewNaturalEndpoint(), add the gossip state check directly in mutateMV and force the updates through the batchlog only. We don't even need the async updates, since they would fail as well.
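For illustration only, here is a rough sketch of the shape of that check. It is not the ticket's patch and not the actual StorageProxy code: the types below (Mutation, Batchlog, ViewReplicaWriter) are stand-ins, and the joined flag stands in for something like StorageService.instance.isJoined().

{code}
// Standalone sketch (not Cassandra source): when this node has not joined the
// ring, its gossip state may be stale, so every materialized view update is
// pushed through the batchlog and the async writes to paired view replicas
// are skipped entirely.
import java.util.Collection;

public class MVWriteSketch
{
    interface Mutation {}                                            // stand-in for the real Mutation
    interface Batchlog { void store(Collection<? extends Mutation> mutations); }
    interface ViewReplicaWriter { void sendAsync(Mutation mutation); }

    private final boolean joined;        // stand-in for e.g. StorageService.instance.isJoined()
    private final Batchlog batchlog;
    private final ViewReplicaWriter viewWriter;

    MVWriteSketch(boolean joined, Batchlog batchlog, ViewReplicaWriter viewWriter)
    {
        this.joined = joined;
        this.batchlog = batchlog;
        this.viewWriter = viewWriter;
    }

    void mutateMV(Collection<? extends Mutation> viewMutations)
    {
        if (!joined)
        {
            // Not joined (e.g. commitlog replay during startup after a
            // decommission elsewhere): gossip state cannot be trusted, so do
            // not try to pair base and view replicas -- batchlog only.
            batchlog.store(viewMutations);
            return;
        }

        // Normal path: batchlog for safety, then async writes to the view
        // replicas paired by getViewNaturalEndpoint().
        batchlog.store(viewMutations);
        for (Mutation mutation : viewMutations)
            viewWriter.sendAsync(mutation);
    }
}
{code}

The point of the sketch is only the control flow: a node that has not joined the ring never attempts to pair base and view replicas and never issues the async view writes.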
> Replaying materialized view updates from commitlog after node decommission crashes Cassandra
> ---------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10413
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10413
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Joel Knighton
>            Assignee: T Jake Luciani
>            Priority: Critical
>             Fix For: 3.0.0 rc2
>
>         Attachments: n1.log, n2.log, n3.log, n4.log, n5.log
>
>
> This issue is reproducible through a Jepsen test, runnable as
> {code}
> lein with-profile +trunk test :only cassandra.mv-test/mv-crash-subset-decommission
> {code}
> The test crashes and restarts nodes while decommissioning other nodes; these actions are not coordinated.
> In [10164|https://issues.apache.org/jira/browse/CASSANDRA-10164], we introduced a change to re-apply materialized view updates on commitlog replay.
> Some nodes, upon restart, crash during commitlog replay: they throw the "Trying to get the view natural endpoint on a non-data replica" runtime exception in getViewNaturalEndpoint. I added logging to getViewNaturalEndpoint to show the results of replicationStrategy.getNaturalEndpoints for the baseToken and the viewToken.
> These problems occur when the baseEndpoints and viewEndpoints are identical but do not contain the broadcast address of the local node.
> For example, a node at 10.0.0.5 crashes on replay of a write whose base token and view token replicas are both [10.0.0.2, 10.0.0.4, 10.0.0.6]. We seem to guard against this by considering pendingEndpoints for the viewToken, but that does not appear to be sufficient.
> I've attached the system.logs from a test run with the added logging. In the attached logs, n1 is at 10.0.0.2, n2 is at 10.0.0.3, and so on; 10.0.0.6/n5 is the decommissioned node.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)