Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 1B31A9D89 for ; Wed, 2 May 2012 16:57:16 +0000 (UTC)
Received: (qmail 77584 invoked by uid 500); 2 May 2012 16:57:15 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 77556 invoked by uid 500); 2 May 2012 16:57:15 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 77548 invoked by uid 99); 2 May 2012 16:57:15 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 02 May 2012 16:57:15 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED,T_RP_MATCHES_RCVD
X-Spam-Check-By: apache.org
Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 02 May 2012 16:57:13 +0000
Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id C949D42C518 for ; Wed, 2 May 2012 16:56:51 +0000 (UTC)
Date: Wed, 2 May 2012 16:56:51 +0000 (UTC)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Message-ID: <1496440044.17779.1335977811845.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1462770822.12392.1335846466079.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (CASSANDRA-4205) SSTables are not updated with max timestamp on upgradesstables/compaction leading to non-optimal performance.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/CASSANDRA-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266705#comment-13266705 ]

Jonathan Ellis commented on CASSANDRA-4205:
-------------------------------------------

bq. A suggested fix is to special case this in upgradesstables so that a max timestamp always exists for all SSTables.

Looking at the code, scrub and upgradesstables and user-defined compactions all force deserialize + maxtimestamp computation. The only operation that does not is cleanup.

> SSTables are not updated with max timestamp on upgradesstables/compaction leading to non-optimal performance.
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4205
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4205
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: Thorkild Stray
>            Assignee: Jonathan Ellis
>            Priority: Critical
>             Fix For: 1.0.10, 1.1.1
>
>
> We upgraded from 0.7.9 to 1.0.7 on a cluster with a heavy update load. After converting all the reads to named column reads instead of get_slice calls, we noticed that we still weren't getting the performance improvements implemented in CASSANDRA-2498. A single named column read was still touching multiple SSTables according to nodetool cfhistograms.
> To verify whether or not this was a reporting issue or a real issue, we ran multiple tests with stress and noticed that it worked as expected. After changing stress so that it ran the read/write test directly in the CF having issues (3 times stress & flush), we noticed that stress also touched multiple SSTables (according to cfhistograms).
> So, the root of the problem is "something" left over from our pre-1.0 days.
> All SSTables were upgraded with upgradesstables, and have been written and compacted many times since the upgrade (4 months ago). The usage pattern for this CF is that it is constantly read and updated (overwritten), but no deletes.
> After discussing the problem with Brandon Williams on #cassandra, it seems the problem might be because a max timestamp has never been written for the old SSTables that were upgraded from pre 1.0. They have only been compacted, and the max timestamp is not recorded during compactions.
> A suggested fix is to special case this in upgradesstables so that a max timestamp always exists for all SSTables.
> {panel}
> 06:08 < driftx> thorkild_: tx. The thing is we don't record the max timestamp on compactions, but we can do it specially for upgradesstables.
> 06:08 < driftx> so, nothing in... nothing out.
> 06:10 < thorkild_> driftx: ah, so when you upgrade from before the metadata was written, and that data is only feed through upgradesstables and compactions -> never properly written?
> 06:10 < thorkild_> that makes sense.
> 06:11 < driftx> right, we never create it, we just reuse it :(
> {panel}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
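
To illustrate the behaviour described in the report, here is a minimal sketch in Python of why a missing max timestamp keeps a named column read from skipping SSTables, and why reusing the input metadata on compaction ("nothing in... nothing out") never repairs it. This is a toy model, not Cassandra's code: the SSTable class, flush, compact, read_column and the recompute flag are all hypothetical names, and a missing max timestamp is modeled as None, which is an assumption about representation only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SSTable:
    """Toy SSTable: one row, column name -> (write timestamp, value)."""
    name: str
    columns: Dict[str, Tuple[int, str]]
    # None models an SSTable whose max-timestamp metadata was never written.
    max_timestamp: Optional[int] = None


def flush(name: str, columns: Dict[str, Tuple[int, str]]) -> SSTable:
    """A fresh flush sees every column, so it can record the true maximum."""
    return SSTable(name, dict(columns), max(ts for ts, _ in columns.values()))


def compact(name: str, inputs: List[SSTable], recompute: bool) -> SSTable:
    """Merge inputs, keeping the newest version of each column.

    recompute=False reuses the inputs' metadata, so an unknown max
    timestamp on any input stays unknown on the output.
    recompute=True derives the max timestamp from the merged data.
    """
    merged: Dict[str, Tuple[int, str]] = {}
    for table in inputs:
        for col, (ts, val) in table.columns.items():
            if col not in merged or ts > merged[col][0]:
                merged[col] = (ts, val)
    if recompute:
        max_ts: Optional[int] = max(ts for ts, _ in merged.values())
    elif any(t.max_timestamp is None for t in inputs):
        max_ts = None
    else:
        max_ts = max(t.max_timestamp for t in inputs)
    return SSTable(name, merged, max_ts)


def read_column(tables: List[SSTable], column: str) -> Tuple[Optional[str], int]:
    """Return (newest value, number of SSTables touched) for one column."""
    known = sorted((t for t in tables if t.max_timestamp is not None),
                   key=lambda t: t.max_timestamp, reverse=True)
    unknown = [t for t in tables if t.max_timestamp is None]
    best: Optional[Tuple[int, str]] = None
    touched = 0
    for table in known:
        # Every remaining table is older than what we already hold: stop.
        if best is not None and best[0] > table.max_timestamp:
            break
        touched += 1
        hit = table.columns.get(column)
        if hit is not None and (best is None or hit[0] > best[0]):
            best = hit
    # Tables without metadata can never be ruled out, so all are read.
    for table in unknown:
        touched += 1
        hit = table.columns.get(column)
        if hit is not None and (best is None or hit[0] > best[0]):
            best = hit
    return (best[1] if best is not None else None), touched
```

In this model, a legacy table with no metadata compacted with recompute=False still has max_timestamp None afterwards, so every read keeps touching it; with recompute=True the merged table gets a real value and a newer flush alone can answer the read.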