Date: Wed, 24 Aug 2016 22:55:20 +0000 (UTC)
From: "Paulo Motta (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-9143) Improving consistency of repairAt field across replicas

[ https://issues.apache.org/jira/browse/CASSANDRA-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435885#comment-15435885 ]

Paulo Motta commented on CASSANDRA-9143:
----------------------------------------

bq.
Since sstables compacted since the beginning of a repair are excluded from anticompaction, normal compaction is enough to create large inconsistencies of the data each node considers repaired. This will cause repaired data to be considered unrepaired, which will cause a lot of unnecessary streaming on the next repair.

While this is a relevant problem, it sounds slightly different from the original problem description, which is to improve the consistency of the repairedAt field, which can become inconsistent when a node fails mid-anti-compaction at the end of the parent repair session. Do you plan to tackle only the original problem, or also the problem of losing repair information from sstables compacted during repair (which is a somewhat harder problem)?

bq. We do the anticompaction up front, but put the anticompacted data into the pending bucket.

How do you plan to perform anti-compaction up front? As Marcus pointed out, we defer anti-compaction to the end of the parent repair session to avoid re-anti-compacting multi-range sstables as repair progresses, so we need a strategy here to avoid or minimize that. But we could perhaps let operators trade off increased I/O for more accurate repair information with anti-compaction checkpoints during long-running repairs.

So I propose we start with the original idea of adding a 2PC to anti-compaction, as suggested in the ticket description, and perhaps on top of that pursue anti-compaction checkpoints/hints in a separate ticket?

> Improving consistency of repairAt field across replicas
> --------------------------------------------------------
>
> Key: CASSANDRA-9143
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9143
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Blake Eggleston
> Priority: Minor
>
> We currently send an anticompaction request to all replicas. During this, a node will split sstables and mark the appropriate ones repaired.
> The problem is that this could fail on some replicas for many reasons, leading to problems in the next repair.
> This is what I am suggesting to improve it.
> 1) Send anticompaction request to all replicas. This can be done at session level.
> 2) During anticompaction, sstables are split but not marked repaired.
> 3) When we get a positive ack from all replicas, the coordinator will send another message called markRepaired.
> 4) On getting this message, replicas will mark the appropriate sstables as repaired.
> This will reduce the window of failure. We can also think of "hinting" the markRepaired message if required.
> Also, the sstables which are streaming can be marked as repaired like it is done now.
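
For illustration only, the four steps in the ticket description could be sketched roughly as follows. This is a toy in-memory model, not Cassandra code: the `Replica` class, the `finish_repair` function, the `fail_anticompaction` flag, and the `abort` step are all hypothetical names and assumptions introduced here, and the only term taken from the ticket is the markRepaired message.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for a replica node; not Cassandra's API."""
    name: str
    fail_anticompaction: bool = False            # simulate a phase-1 failure
    pending: dict = field(default_factory=dict)  # session id -> split sstables
    repaired: set = field(default_factory=set)   # sstables marked repaired

    def anticompact(self, session_id, sstables):
        # Steps 1-2: split the sstables but do not mark them repaired yet.
        if self.fail_anticompaction:
            return False
        self.pending[session_id] = set(sstables)
        return True

    def mark_repaired(self, session_id):
        # Step 4: on markRepaired, promote the pending sstables to repaired.
        self.repaired |= self.pending.pop(session_id, set())

    def abort(self, session_id):
        # Assumed rollback: the split sstables simply stay unrepaired.
        self.pending.pop(session_id, None)


def finish_repair(session_id, replicas, sstables):
    """Coordinator side: send markRepaired only if every replica acks."""
    acks = [r.anticompact(session_id, sstables) for r in replicas]
    if all(acks):
        # Step 3: positive ack from all replicas, so send markRepaired.
        for r in replicas:
            r.mark_repaired(session_id)
        return True
    for r in replicas:
        r.abort(session_id)
    return False
```

In this sketch a failure on any one replica leaves every replica with nothing marked repaired for that session, so the replicas stay consistent with each other. The remaining failure window is the delivery of markRepaired itself, which is where the "hinting" idea from the description would apply.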