Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Tue, 29 Nov 2016 13:14:59 +0000 (UTC)
From: "Marcus Eriksson (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12819508.1428538488000.389093.1480425299695@Atlassian.JIRA>
In-Reply-To: <JIRA.12819508.1428538488000@Atlassian.JIRA>
References: <JIRA.12819508.1428538488000@Atlassian.JIRA> <JIRA.12819508.1428538488067@arcas>
Subject: [jira] [Commented] (CASSANDRA-9143) Improving consistency of
 repairAt field across replicas
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Tue, 29 Nov 2016 13:15:01 -0000


    [ https://issues.apache.org/jira/browse/CASSANDRA-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15705274#comment-15705274 ] 

Marcus Eriksson commented on CASSANDRA-9143:
--------------------------------------------

Looks good in general. Some comments:

* Rename the cleanup compaction task; the name is very confusing given the existing cleanup compactions
* Should we prioritize the pending-repair-cleanup compactions?
** If we don't, we might compare different datasets: a repair fails halfway through and one node happens to move the pending data to unrepaired, then the operator retriggers repair and the replicas are validated against different datasets. If we instead move the data back as quickly as possible, we minimize this window
** It would also help the next normal compactions, as we might be able to include more sstables in the repaired/unrepaired strategies
* Is there any point in doing anticompaction after repair with -full repairs? Can we always do consistent repairs? We would need to anticompact already-repaired sstables into pending, but that should not be a big problem?
* In CompactionManager#getSSTablesToValidate we still mark all unrepaired sstables as repairing; we don't need to do that for consistent repairs. And if we can do consistent repair for -full as well, all of that code can be removed
* In handleStatusRequest, if we don't have the local session, we should probably return that the session has failed?
* Fixed some minor nits here: https://github.com/krummas/cassandra/commit/24ef8b2f6df98431d66519ee12452df3db84fd7d
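
To illustrate the handleStatusRequest point above, here is a minimal sketch in Python (not the patch's actual Java code). The SessionState values, the local_sessions mapping, and the function signature are all hypothetical names chosen for illustration; the only behavior taken from the comment is that a session unknown to the local node is reported as failed.

```python
from enum import Enum


class SessionState(Enum):
    # Hypothetical state names; the real patch may use different ones.
    PREPARING = "preparing"
    REPAIRING = "repairing"
    FINALIZED = "finalized"
    FAILED = "failed"


def handle_status_request(local_sessions, session_id):
    """Return the state of session_id, treating an unknown session as FAILED."""
    # A node with no record of the session cannot vouch for it, so the
    # conservative answer is FAILED rather than an error or no reply.
    return local_sessions.get(session_id, SessionState.FAILED)


sessions = {"s1": SessionState.REPAIRING}
known = handle_status_request(sessions, "s1")
unknown = handle_status_request(sessions, "no-such-session")
```

Answering FAILED for an unknown session lets the requester treat the session as unusable instead of waiting on a node that will never know about it.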


> Improving consistency of repairAt field across replicas 
> --------------------------------------------------------
>
>                 Key: CASSANDRA-9143
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9143
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Blake Eggleston
>
> We currently send an anticompaction request to all replicas. During this, a node will split sstables and mark the appropriate ones repaired.
> The problem is that this can fail on some replicas for many reasons, leading to problems in the next repair.
> Here is what I suggest to improve it:
> 1) Send an anticompaction request to all replicas. This can be done at the session level.
> 2) During anticompaction, sstables are split but not marked repaired.
> 3) When we get a positive ack from all replicas, the coordinator will send another message called markRepaired.
> 4) On getting this message, replicas will mark the appropriate sstables as repaired.
> This will reduce the window of failure. We can also think of "hinting" the markRepaired message if required.
> Also, the sstables which are streaming can be marked as repaired like it is done now.
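
The four steps in the description above can be sketched as a toy two-phase exchange. This is a simplified Python model under stated assumptions, not Cassandra code: the Replica class, its fields, and the coordinate function are hypothetical, and failures are simulated with a flag rather than real messaging.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: tracks which sstables are split and which are marked repaired."""
    name: str
    fail_anticompaction: bool = False  # simulates a replica-side failure
    split: set = field(default_factory=set)
    repaired: set = field(default_factory=set)

    def anticompact(self, sstables):
        # Steps 1-2: split the sstables but do NOT mark them repaired yet.
        if self.fail_anticompaction:
            return False  # negative ack
        self.split |= set(sstables)
        return True  # positive ack

    def mark_repaired(self, sstables):
        # Step 4: mark only sstables that were split in the first phase.
        self.repaired |= self.split & set(sstables)


def coordinate(replicas, sstables):
    """Return True if every replica acked and markRepaired was sent to all."""
    acks = [r.anticompact(sstables) for r in replicas]
    if not all(acks):
        return False  # step 3 not reached: nothing is marked repaired anywhere
    for r in replicas:
        r.mark_repaired(sstables)
    return True


healthy = [Replica("a"), Replica("b")]
ok = coordinate(healthy, {"sstable-1", "sstable-2"})

flaky = [Replica("c"), Replica("d", fail_anticompaction=True)]
failed = coordinate(flaky, {"sstable-1"})
```

In the failing run, replica "c" has split its sstables but marked nothing repaired, so no replica disagrees about repaired state. The sketch does not model a markRepaired message being lost, which is the remaining window the description proposes to cover by "hinting".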

