Date: Tue, 29 Aug 2017 18:05:00 +0000 (UTC)
From: "Ariel Weisberg (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-13785) Compaction fails for SSTables with large number of keys

[ https://issues.apache.org/jira/browse/CASSANDRA-13785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145799#comment-16145799 ]

Ariel Weisberg commented on CASSANDRA-13785:
--------------------------------------------

New stuff and summarizing.

* So you are the unlucky person to catch SafeMemoryWriter misbehaving. If we want our test hygiene to be good, you would be the person to write a basic set of unit tests for it. Go through each external method in SafeMemoryWriter and test its API (null handling, largest value, smallest value, negative value, 0) and internal state (0 length, empty buffer, full buffer, etc.).
* This will still hit the same issue if the decorated key size is larger than expected, right? I know almost everyone uses fixed-length keys, but it looks fragile not to at least fail fast if they use variable-length keys or if the size changes in future iterations.
* This is a bug in {{SafeMemoryWriter.length()}}?
* Even if this throws AssertionError, it still shouldn't leak memory, right?
We should log the error, but not leak. I think (I'll check) that we don't treat assertion failures as fatal and crash the process.

> Compaction fails for SSTables with large number of keys
> -------------------------------------------------------
>
>                 Key: CASSANDRA-13785
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13785
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Jay Zhuang
>            Assignee: Jay Zhuang
>
> Every few minutes there are "LEAK DETECTED" messages in the log:
> {noformat}
> ERROR [Reference-Reaper:1] 2017-08-18 17:18:40,357 Ref.java:223 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3ed22d7) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@1022568824:[Memory@[0..159b6ba4), Memory@[0..d8123468)] was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2017-08-18 17:20:49,693 Ref.java:223 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6470405b) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@97898152:[Memory@[0..159b6ba4), Memory@[0..d8123468)] was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2017-08-18 17:22:38,519 Ref.java:223 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6fc4af5f) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@1247404854:[Memory@[0..159b6ba4), Memory@[0..d8123468)] was not released before the reference was garbage collected
> {noformat}
> I debugged the issue and found that it is triggered by failed compactions: if the compacted SSTable has more than 51m ({{Integer.MAX_VALUE / 40}}) keys, it fails to create the IndexSummary: [IndexSummary:84|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/IndexSummary.java#L84].
> Cassandra retries the compaction every few minutes and it keeps failing.
> The root cause is that when [creating SafeMemoryWriter|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L112] with {{> Integer.MAX_VALUE}} space, it returns only the trailing {{Integer.MAX_VALUE}} space [SafeMemoryWriter.java:83|https://github.com/apache/cassandra/blob/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117/src/java/org/apache/cassandra/io/util/SafeMemoryWriter.java#L83], which makes the first [entries.length()|https://github.com/apache/cassandra/blob/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L173] not 0. So the assert fails here: [IndexSummary:84|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/IndexSummary.java#L84]
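
The arithmetic behind the root cause in the quoted report can be sketched roughly as below. This is not the Cassandra code: {{TailOffsetSketch}}, {{tailOffset}}, {{length}} and {{BYTES_PER_KEY}} are hypothetical names, and the 40 bytes per key is only an assumption read off the {{Integer.MAX_VALUE / 40}} threshold. The sketch models a writer that can expose only the trailing {{Integer.MAX_VALUE}} bytes of its allocation and reports its length as that window's starting offset plus the bytes written so far.

```java
final class TailOffsetSketch {
    // Assumption: requested space grows at 40 bytes per key, inferred from
    // the Integer.MAX_VALUE / 40 threshold quoted in the report.
    static final long BYTES_PER_KEY = 40L;

    // Offset at which a window covering only the trailing
    // Integer.MAX_VALUE bytes of the allocation would begin.
    static long tailOffset(long requestedBytes) {
        return Math.max(0L, requestedBytes - Integer.MAX_VALUE);
    }

    // Simplified model of length(): window start plus bytes written into it.
    static long length(long requestedBytes, int written) {
        return tailOffset(requestedBytes) + written;
    }

    public static void main(String[] args) {
        long maxSafeKeys = Integer.MAX_VALUE / BYTES_PER_KEY;
        System.out.println(maxSafeKeys);                                 // 53687091
        // At the threshold the window starts at 0, so an empty writer reports 0.
        System.out.println(length(maxSafeKeys * BYTES_PER_KEY, 0));       // 0
        // One key more and an empty writer already reports a non-zero length.
        System.out.println(length((maxSafeKeys + 1) * BYTES_PER_KEY, 0)); // 33
    }
}
```

Under these assumptions, the first {{length()}} call on an oversized writer is non-zero before anything has been written, which is the condition the report says trips the assert in IndexSummary.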