Date: Fri, 25 Mar 2011 01:12:06 +0000 (UTC)
From: "Benjamin Coverston (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Message-ID: <1888851870.10505.1301015526155.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1786203103.8755.1299097477089.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (CASSANDRA-2261) During Compaction, Corrupt SSTables with rows that cause failures should be identified and blacklisted.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/CASSANDRA-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Coverston updated CASSANDRA-2261:
------------------------------------------

    Attachment:     (was: 2261.txt)

> During Compaction, Corrupt SSTables with rows that cause failures should be identified and blacklisted.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2261
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2261
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Benjamin Coverston
>            Assignee: Benjamin Coverston
>            Priority: Minor
>              Labels: not_a_pony
>             Fix For: 0.7.5
>
>         Attachments: 2261.patch
>
>
> When a compaction of a set of SSTables fails because of corruption, the node keeps trying to compact that SSTable, causing pending compactions to build up.
> One way to mitigate this problem would be to log the error, identify the specific SSTable that caused the failure, and then blacklist that SSTable so that it is excluded from future compactions. For this we could simply store the problematic SSTable's name in memory.
> If it's not possible to identify the SSTable that caused the issue, then blacklisting the (ordered) permutation of SSTables to be compacted together could solve the problem in the more general case, and avoid issues where two (or more) SSTables have trouble compacting a particular row. For this option we would probably want to store the lists of bad combinations somewhere in the system table so that they can survive a node failure (I have seen a few cases where a compaction caused a node failure).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
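
The first option in the description above (log the error, remember the failing SSTable's name in memory, and skip it in later compactions) could look roughly like the sketch below. This is only an illustration under stated assumptions: the class CompactionBlacklist, its method names, and the use of plain file-name strings are all hypothetical and are not taken from the attached patch or from Cassandra's code base.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch (not Cassandra's API): an in-memory blacklist of
 * SSTable names that failed during compaction. The set lives only in
 * memory, so it is lost when the node restarts.
 */
final class CompactionBlacklist {
    // Thread-safe set, since compactions may run on several threads.
    private final Set<String> blacklisted = ConcurrentHashMap.newKeySet();

    /** Logs the failure and records the SSTable; returns true if newly added. */
    boolean markCorrupt(String sstableName, Throwable cause) {
        boolean added = blacklisted.add(sstableName);
        if (added) {
            System.err.println("Blacklisting " + sstableName
                    + " after compaction failure: " + cause);
        }
        return added;
    }

    /** True if the named SSTable was previously marked as corrupt. */
    boolean isBlacklisted(String sstableName) {
        return blacklisted.contains(sstableName);
    }

    /** Returns the candidates that are not blacklisted, preserving their order. */
    List<String> filterCandidates(Collection<String> candidates) {
        List<String> eligible = new ArrayList<>();
        for (String name : candidates) {
            if (!blacklisted.contains(name)) {
                eligible.add(name);
            }
        }
        return eligible;
    }
}
```

A compaction loop would call markCorrupt from its failure handler and pass each new candidate list through filterCandidates before compacting. The second option (persisting bad combinations so they survive a node failure) would need durable storage and is not covered by this sketch.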