Return-Path: X-Original-To: apmail-cassandra-commits-archive@www.apache.org Delivered-To: apmail-cassandra-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A015F108DE for ; Tue, 6 Jan 2015 09:48:34 +0000 (UTC) Received: (qmail 34868 invoked by uid 500); 6 Jan 2015 09:48:35 -0000 Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org Received: (qmail 34818 invoked by uid 500); 6 Jan 2015 09:48:35 -0000 Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cassandra.apache.org Delivered-To: mailing list commits@cassandra.apache.org Received: (qmail 34806 invoked by uid 99); 6 Jan 2015 09:48:35 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 06 Jan 2015 09:48:35 +0000 Date: Tue, 6 Jan 2015 09:48:35 +0000 (UTC) From: "Benedict (JIRA)" To: commits@cassandra.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (CASSANDRA-8546) RangeTombstoneList becoming bottleneck on tombstone heavy tasks MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/CASSANDRA-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-8546: -------------------------------- Assignee: (was: Benedict) > RangeTombstoneList becoming bottleneck on tombstone heavy tasks > --------------------------------------------------------------- > > Key: CASSANDRA-8546 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8546 > Project: Cassandra > Issue Type: Improvement > Components: Core > Environment: 2.0.11 / 2.1 > Reporter: Dominic Letz > Fix For: 2.1.3 > > Attachments: cassandra-2.0.11-8546.txt, cassandra-2.1-8546.txt, rangetombstonelist_compaction.png, rangetombstonelist_mutation.png, rangetombstonelist_read.png, tombstone_test.tgz > > > I would like to propose a change of the data structure used in the RangeTombstoneList to store and insert tombstone ranges to something with at least O(log N) insert in the middle and at near O(1) and start AND end. Here is why: > When having tombstone heavy work-loads the current implementation of RangeTombstoneList becomes a bottleneck with slice queries. > Scanning the number of tombstones up to the default maximum (100k) can take up to 3 minutes of how addInternal() scales on insertion of middle and start elements. > The attached test shows that with 50k deletes from both sides of a range. > INSERT 1...110000 > flush() > DELETE 1...50000 > DELETE 110000...60000 > While one direction performs ok (~400ms on my notebook): > {code} > SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp DESC LIMIT 1 > {code} > The other direction underperforms (~7seconds on my notebook) > {code} > SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp ASC LIMIT 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)