Date: Fri, 29 Jun 2012 21:44:43 +0000 (UTC)
From: "Nick Bailey (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Message-ID: <2077166150.73499.1341006283826.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <200810271.72397.1340991404141.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (CASSANDRA-4396) Subcolumns not removed when compacting tombstoned super column

    [ https://issues.apache.org/jira/browse/CASSANDRA-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404238#comment-13404238 ]

Nick Bailey commented on CASSANDRA-4396:
----------------------------------------

This is also a problem with simply flushing super column deletes:

From a fresh cluster I can create a supercolumn with subcolumns,
delete that supercolumn, trigger a flush with nodetool, and observe the subcolumn data in the flushed sstable with sstable2json.

> Subcolumns not removed when compacting tombstoned super column
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-4396
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4396
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Nick Bailey
>            Assignee: Jonathan Ellis
>             Fix For: 1.0.11, 1.1.3
>
>
> When we compact a tombstone for a super column with the old data for that super column, we end up writing the deleted super column and all the subcolumn data that is now worthless to the new sstable. This is especially inefficient when reads need to scan tombstones during a slice.
> Here is the output of a simple test I ran to confirm:
> insert supercolumn, then flush
> {noformat}
> [Nicks-MacBook-Pro:12:20:52 cassandra-1.0] cassandra$ bin/sstable2json ~/.ccm/1node/node1/data/Keyspace2/Super4-hd-1-Data.db
> {
> "6b657931": {"supercol1": {"deletedAt": -9223372036854775808, "subColumns": [["737562636f6c31","7468697320697320612074657374",1340990212532000]]}}
> }
> {noformat}
> delete supercolumn, flush again
> {noformat}
> [Nicks-MacBook-Pro:12:20:59 cassandra-1.0] cassandra$ bin/nodetool -h localhost flush
> [Nicks-MacBook-Pro:12:22:41 cassandra-1.0] cassandra$ bin/sstable2json ~/.ccm/1node/node1/data/Keyspace2/Super4-hd-2-Data.db
> {
> "6b657931": {"supercol1": {"deletedAt": 1340990544005000, "subColumns": []}}
> }
> {noformat}
> compact and check resulting sstable
> {noformat}
> [Nicks-MacBook-Pro:12:22:55 cassandra-1.0] cassandra$ bin/nodetool -h localhost compact
> [Nicks-MacBook-Pro:12:23:09 cassandra-1.0] cassandra$ bin/sstable2json ~/.ccm/1node/node1/data/Keyspace2/Super4-hd-3-Data.db
> {
> "6b657931": {"supercol1": {"deletedAt": 1340990544005000, "subColumns": [["737562636f6c31","7468697320697320612074657374",1340990212532000]]}}
> }
> [Nicks-MacBook-Pro:12:23:20 cassandra-1.0] cassandra$
> {noformat}
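
The check done by eye on the sstable2json dumps in the quoted report can be sketched in Python. This is only an illustration, not part of Cassandra or its tools: the function name shadowed_subcolumns is hypothetical, the dump strings are copied from the report, and the rule that a subcolumn is dead when its timestamp is less than or equal to the supercolumn's deletedAt is an assumption about how the tombstone shadows older data. Row keys, subcolumn names, and values are hex-encoded in the dumps, so the sketch decodes them as UTF-8.

```python
import json

# Dumps copied from the report: after the insert, and after delete + compact.
AFTER_INSERT = """{
"6b657931": {"supercol1": {"deletedAt": -9223372036854775808, "subColumns": [["737562636f6c31","7468697320697320612074657374",1340990212532000]]}}
}"""
AFTER_COMPACT = """{
"6b657931": {"supercol1": {"deletedAt": 1340990544005000, "subColumns": [["737562636f6c31","7468697320697320612074657374",1340990212532000]]}}
}"""


def shadowed_subcolumns(dump_text):
    """Return (row key, supercolumn, subcolumn, value) for every subcolumn
    that is still present although the supercolumn tombstone covers it.

    Assumption: a subcolumn is covered when timestamp <= deletedAt.
    """
    found = []
    for row_key_hex, supercolumns in json.loads(dump_text).items():
        row_key = bytes.fromhex(row_key_hex).decode("utf-8")
        for sc_name, sc in supercolumns.items():
            for entry in sc["subColumns"]:
                # Each entry starts with hex name, hex value, timestamp.
                name_hex, value_hex, timestamp = entry[:3]
                if timestamp <= sc["deletedAt"]:
                    found.append((
                        row_key,
                        sc_name,
                        bytes.fromhex(name_hex).decode("utf-8"),
                        bytes.fromhex(value_hex).decode("utf-8"),
                    ))
    return found


if __name__ == "__main__":
    for label, dump in (("after insert", AFTER_INSERT),
                        ("after compact", AFTER_COMPACT)):
        print(label, shadowed_subcolumns(dump))
```

For the dump taken after the insert the list is empty, because deletedAt is the minimum long value (no tombstone). For the compacted dump the list contains subcol1 of key1, which is the worthless data the report describes.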