Return-Path: X-Original-To: apmail-cassandra-commits-archive@www.apache.org Delivered-To: apmail-cassandra-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 80071E8A4 for ; Thu, 14 Mar 2013 22:12:15 +0000 (UTC) Received: (qmail 75277 invoked by uid 500); 14 Mar 2013 22:12:15 -0000 Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org Received: (qmail 75237 invoked by uid 500); 14 Mar 2013 22:12:15 -0000 Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cassandra.apache.org Delivered-To: mailing list commits@cassandra.apache.org Received: (qmail 75227 invoked by uid 99); 14 Mar 2013 22:12:15 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 14 Mar 2013 22:12:15 +0000 Date: Thu, 14 Mar 2013 22:12:15 +0000 (UTC) From: "Jonathan Ellis (JIRA)" To: commits@cassandra.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (CASSANDRA-5351) Avoid repairing already-repaired data by default MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-5351: -------------------------------------- Description: Repair has always built its merkle tree from all the data in a columnfamily, which is guaranteed to work but is inefficient. We can improve this by remembering which sstables have already been successfully repaired, and only repairing sstables new since the last repair. (This automatically makes CASSANDRA-3362 much less of a problem too.) The tricky part is, compaction will (if not taught otherwise) mix repaired data together with non-repaired. So we should segregate unrepaired sstables from the repaired ones. was: Repair has always built its merkle tree from all the data in a columnfamily, which is guaranteed to work but is inefficient. We can improve this by remembering which sstables have already been successfully repaired, and only repairing sstables new since the last repair. The tricky part is, compaction will (if not taught otherwise) mix repaired data together with non-repaired. So we should segregate unrepaired sstables from the repaired ones. > Avoid repairing already-repaired data by default > ------------------------------------------------ > > Key: CASSANDRA-5351 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5351 > Project: Cassandra > Issue Type: Task > Components: Core > Reporter: Jonathan Ellis > Labels: repair > Fix For: 2.0 > > > Repair has always built its merkle tree from all the data in a columnfamily, which is guaranteed to work but is inefficient. > We can improve this by remembering which sstables have already been successfully repaired, and only repairing sstables new since the last repair. (This automatically makes CASSANDRA-3362 much less of a problem too.) > The tricky part is, compaction will (if not taught otherwise) mix repaired data together with non-repaired. So we should segregate unrepaired sstables from the repaired ones. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira