Date: Wed, 19 Jul 2017 07:19:00 +0000 (UTC)
From: "Jeff Jirsa (JIRA)"
To: commits@cassandra.apache.org
Subject: [jira] [Updated] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

     [ https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13696:
-----------------------------------
    Fix Version/s: 4.x
                   3.11.x
                   3.0.x

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-13696
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jay Zhuang
>            Assignee: Jay Zhuang
>            Priority: Blocker
>             Fix For: 3.0.x, 3.11.x, 4.x
>
>
> {noformat}
> WARN [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints
> ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 HintsDispatchExecutor.java:234 - Failed to dispatch hints file
a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch exception
>     at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) ~[main/:na]
>     at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) ~[main/:na]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[main/:na]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) ~[main/:na]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) ~[main/:na]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) ~[main/:na]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) ~[main/:na]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) [main/:na]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) [main/:na]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) [main/:na]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) [main/:na]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [main/:na]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
> Caused by: java.io.IOException: Digest mismatch exception
>     at
org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216) ~[main/:na]
>     at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190) ~[main/:na]
>     ... 16 common frames omitted
> {noformat}
> It causes multiple Cassandra nodes to stop [by default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188].
> Here are the steps to reproduce on a 3-node cluster with RF=3:
> 1. Stop node1.
> 2. Write some data at consistency level QUORUM (or ONE); this generates hints files on node2/node3.
> 3. Drop the table.
> 4. Start node1.
> node2/node3 will report "corrupted hints file" and stop. The impact is severe for a large cluster: when it happens, almost all nodes go down at the same time, and we have to remove all the hints files that contain the dropped table to bring the nodes back.
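
The report above does not state the root cause. As a rough illustration only, the sketch below shows one way a checksummed record reader could turn "table is unknown" into a spurious "Digest mismatch": it stops consuming a record part-way through, then still verifies the checksum. The record layout, the function names (`write_record`, `read_records`) and the `skip_safely` flag are all hypothetical and are not Cassandra's hints format or code.

```python
import struct
import zlib

# Hypothetical record layout (NOT Cassandra's real hints format):
#   4-byte payload length | payload | 4-byte CRC32 of the payload
# The first payload byte is a table id; the rest is the hint body.


class DigestMismatch(IOError):
    """Raised when a record's stored checksum does not match what was read."""


def write_record(table_id: int, body: bytes) -> bytes:
    payload = bytes([table_id]) + body
    return (struct.pack(">I", len(payload)) + payload
            + struct.pack(">I", zlib.crc32(payload)))


def read_records(data: bytes, known_tables: set, skip_safely: bool) -> list:
    """Return (table_id, body) for every record that can be read.

    skip_safely=False models a reader that gives up on an unknown-table
    record after reading only the table id, yet still compares checksums.
    skip_safely=True skips the payload and its checksum together.
    """
    out, pos = [], 0
    while pos < len(data):
        (size,) = struct.unpack_from(">I", data, pos)
        pos += 4
        payload = data[pos:pos + size]
        (stored_crc,) = struct.unpack_from(">I", data, pos + size)
        pos += size + 4

        table_id = payload[0]
        if table_id not in known_tables:
            if skip_safely:
                continue  # drop the whole record, no digest check
            consumed = payload[:1]  # only the table id was consumed
        else:
            consumed = payload

        if zlib.crc32(consumed) != stored_crc:
            raise DigestMismatch("Digest mismatch exception")
        out.append((table_id, payload[1:]))
    return out


if __name__ == "__main__":
    # Table 2 plays the role of the dropped table from the steps above.
    hints = (write_record(1, b"kept") + write_record(2, b"dropped")
             + write_record(1, b"also kept"))
    print(read_records(hints, known_tables={1}, skip_safely=True))
    try:
        read_records(hints, known_tables={1}, skip_safely=False)
    except DigestMismatch as exc:
        print("reader aborted:", exc)
```

Under these assumptions, the file itself is intact in both runs; only the handling of the dropped table's record decides whether the reader continues or reports corruption.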