Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 2C69917E3A for ; Mon, 1 Jun 2015 13:08:18 +0000 (UTC)
Received: (qmail 21903 invoked by uid 500); 1 Jun 2015 13:08:18 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 21868 invoked by uid 500); 1 Jun 2015 13:08:18 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 21856 invoked by uid 99); 1 Jun 2015 13:08:17 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2015 13:08:17 +0000
Date: Mon, 1 Jun 2015 13:08:17 +0000 (UTC)
From: "Sylvain Lebresne (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (CASSANDRA-9486) LazilyCompactedRow accumulates all expired RangeTombstones
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/CASSANDRA-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567277#comment-14567277 ]

Sylvain Lebresne commented on CASSANDRA-9486:
---------------------------------------------

As Benedict mentions above, expired tombstones are only part of the problem, and we have 2 other problems: 1) only cells "trim" the tracker and 2) we don't pass shadowed cells to the tracker. As a result, if we have many RTs and the only cells we have are deleted by those RTs, all the RTs end up accumulated, even though we don't need to keep them. Let's maybe fix it all here.
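To make the two problems above concrete, here is a minimal Java sketch of a tracker that trims on every atom it sees, range tombstones and cells alike. It is not the code from any Cassandra branch: the names (RangeTombstoneTrackerSketch, Tracker, update, isDeleted, openCount), the use of int positions, and the assumption that atoms arrive in sorted order are all simplifications made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of these names are Cassandra's actual API.
// Assumption: clustering positions are plain ints and atoms arrive in sorted order.
class RangeTombstoneTrackerSketch {

    /** Hypothetical range tombstone covering [start, end], deleted at a timestamp. */
    static final class RangeTombstone {
        final int start;
        final int end;
        final long deletedAt;

        RangeTombstone(int start, int end, long deletedAt) {
            this.start = start;
            this.end = end;
            this.deletedAt = deletedAt;
        }
    }

    static final class Tracker {
        // Tombstones that may still cover atoms we have not seen yet.
        private final List<RangeTombstone> open = new ArrayList<>();

        /** Drop every tombstone that ends before the given position. */
        private void trim(int position) {
            open.removeIf(rt -> rt.end < position);
        }

        /** A new RT trims the tracker before it is added (problem 1). */
        void update(RangeTombstone rt) {
            trim(rt.start);
            open.add(rt);
        }

        /** Every cell, live or shadowed, trims the tracker too (problem 2). */
        void update(int cellPosition) {
            trim(cellPosition);
        }

        /** True if an open RT covers the cell and is at least as recent as it. */
        boolean isDeleted(int cellPosition, long cellTimestamp) {
            for (RangeTombstone rt : open) {
                if (rt.start <= cellPosition && cellPosition <= rt.end
                        && rt.deletedAt >= cellTimestamp) {
                    return true;
                }
            }
            return false;
        }

        int openCount() {
            return open.size();
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        // 1000 disjoint RTs and no live cells: each new RT evicts the previous one.
        for (int i = 0; i < 1000; i++) {
            tracker.update(new RangeTombstone(i * 10, i * 10 + 5, 1L));
        }
        System.out.println(tracker.openCount()); // prints 1
    }
}
```

With disjoint tombstones the sketch holds a single entry at a time, so the per-cell scan in isDeleted stays short. Overlapping tombstones remain open until an atom moves past their end.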
On top of that, the tracker class is admittedly a bit ugly already, and adding more special casing for expired tombstones doesn't particularly help. So I've pushed [here|https://github.com/pcmanus/cassandra/commits/9486] a suggested alternative that solves the problems described above and, I think, simplifies the code in the process (partly thanks to commenting it).

> LazilyCompactedRow accumulates all expired RangeTombstones
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-9486
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9486
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Marcus Eriksson
>            Priority: Critical
>             Fix For: 3.x, 2.1.x, 2.0.x, 2.2.x, 1.2.x
>
>         Attachments: 0001-9486.patch
>
>
> LazilyCompactedRow initializes a ColumnIndex.Builder to use its RangeTombstone.Tracker, but it only calls update() with a RT argument, never an atom. The Tracker only ever _adds_ if it receives a RT, never removes. So all the RT ever seen for the partition (that have expired) remain in memory until the compaction completes. To make matters worse, this then forces a linear scan of all of these RT for each live cell we add, so this extra load hangs around for a long time, and compactions stall.
> This issue is biting one of our users badly (at least, it seems likely to be this issue), and there may be others. This user is not even making use of RT extensively themselves, only collections (presumably with a complete overwrite of the contents of the collection, resulting in a RT being generated).
> Probably the best solution is to make the RT addition itself remove any already present that are no longer helpful.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)