Date: Wed, 12 Aug 2015 15:32:47 +0000 (UTC)
From: "Branimir Lambov (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors

    [ https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693685#comment-14693685 ]

Branimir Lambov commented on CASSANDRA-9749:
--------------------------------------------

[Branch|https://github.com/blambov/cassandra/tree/9749-2.2] updated.

bq. Boolean.getBoolean() can check the system property

Done.

bq. What is the motivation for changing this to warn? Is it going to cause operators concern that is unwarranted?
The patch includes a general heightening of the message severity for replay problems (what were warnings before are now errors, even when they are ignored). Let me know if you prefer to keep it as info.

bq. Do you have to make all those object arrays for handleReplayError? varargs won't handle it correctly?

Apparently Eclipse does this when you ask it to inline a method with varargs. Fixed, apologies for not noticing it myself.

bq. Some of the error conditions that now are supposed to throw don't have unit test coverage. They weren't tested before either, but this is an opportunity to make sure the errors work.

Fixed the existing tests, which were only hitting the invalid descriptor path, and added tests for the newer log formats that include a descriptor.

> CommitLogReplayer continues startup after encountering errors
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-9749
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Branimir Lambov
>             Fix For: 2.2.x
>
>
> There are a few places where the commit log recovery method either skips sections or just returns when it encounters errors.
> Specifically if it can't read the header here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314 and here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct user intervention (ie: fix what's wrong, or remove the bad file and restart) since we're basically losing data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
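
For readers following the review points in the comment above, here is a minimal sketch of the two Java idioms it mentions: reading a flag with Boolean.getBoolean() and passing message arguments through varargs rather than explicit object arrays. It is not the code from the patch. The class name, the property name, the exception type, and the handleReplayError signature shown here are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these names are taken from the Cassandra patch.
final class ReplayErrorSketch {
    // Hypothetical system property name, chosen for this example.
    static final String IGNORE_ERRORS_PROPERTY = "example.commitlog.ignorereplayerrors";

    // Stand-in for a logger: every replay problem is recorded as an error,
    // whether or not it is subsequently ignored.
    static final List<String> loggedErrors = new ArrayList<>();

    // Boolean.getBoolean(name) returns true only when the system property
    // exists and equals "true" (case-insensitive); otherwise false.
    static boolean ignoreReplayErrors() {
        return Boolean.getBoolean(IGNORE_ERRORS_PROPERTY);
    }

    // Varargs let callers pass format arguments directly, with no need to
    // write new Object[] { ... } at each call site.
    static void handleReplayError(String message, Object... args) {
        String text = String.format(message, args);
        loggedErrors.add(text);
        if (!ignoreReplayErrors()) {
            // Assumed behaviour: stop instead of silently continuing.
            throw new IllegalStateException(text);
        }
    }
}
```

Under these assumptions, a call such as ReplayErrorSketch.handleReplayError("Could not read header of %s", fileName) throws by default, and only records the error when the JVM is started with -Dexample.commitlog.ignorereplayerrors=true.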