Date: Tue, 26 Apr 2016 22:15:13 +0000 (UTC)
From: "Wei Deng (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-11656) sstabledump has inconsistency in deletion_time printout

[ https://issues.apache.org/jira/browse/CASSANDRA-11656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259049#comment-15259049 ]

Wei Deng commented on CASSANDRA-11656:
--------------------------------------

I tested out the patch [~cnlwsu] provided in CASSANDRA-11655.
However, I still see some discrepancies like the following:

{noformat}
~/cassandra-trunk/tools/bin/sstabledump ma-15-big-Data.db
[
  {
    "partition" : {
      "key" : [ "1" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 18,
        "clustering" : [ "c1" ],
        "liveness_info" : { "tstamp" : 1461646542601774 },
        "cells" : [
          { "name" : "val0_int", "deletion_info" : { "tstamp" : 1461649343 }, "tstamp" : 1461649343000508 },
          { "name" : "val1_set_of_int", "deletion_info" : { "deletion_time" : 1461647295880443, "tstamp" : 1461647295 } },
          { "name" : "val1_set_of_int", "path" : [ "1" ], "deletion_info" : { "tstamp" : 1461647320 }, "tstamp" : 1461647320160261 },
          { "name" : "val1_set_of_int", "path" : [ "10" ], "value" : "", "tstamp" : 1461647295880444 },
          { "name" : "val1_set_of_int", "path" : [ "11" ], "value" : "", "tstamp" : 1461647295880444 },
          { "name" : "val1_set_of_int", "path" : [ "12" ], "value" : "", "tstamp" : 1461647295880444 }
        ]
      },
      {
        "type" : "row",
        "position" : 86,
        "clustering" : [ "c2" ],
        "deletion_info" : { "deletion_time" : 1461647588089843, "tstamp" : 1461647588 },
        "cells" : [ ]
      },
      {
        "type" : "row",
        "position" : 101,
        "clustering" : [ "c4" ],
        "liveness_info" : { "tstamp" : 1461649635932899 },
        "cells" : [ ]
      },
      {
        "type" : "row",
        "position" : 114,
        "clustering" : [ "c5" ],
        "liveness_info" : { "tstamp" : 1461650266651050, "ttl" : 60, "expires_at" : 1461650326, "expired" : true },
        "cells" : [
          { "name" : "val0_int", "value" : "500", "tstamp" : 1461650241403672 },
          { "name" : "val1_set_of_int", "deletion_info" : { "deletion_time" : 1461650241403671, "tstamp" : 1461650241 } },
          { "name" : "val1_set_of_int", "path" : [ "111" ], "value" : "", "tstamp" : 1461650241403672 },
          { "name" : "val1_set_of_int", "path" : [ "222" ], "value" : "", "tstamp" : 1461650241403672 },
          { "name" : "val1_set_of_int", "path" : [ "333" ], "value" : "", "tstamp" : 1461650241403672 }
        ]
      },
      {
        "type" : "row",
        "position" : 180,
        "clustering" : [ "c6" ],
        "deletion_info" : { "deletion_time" : 1461708091029189, "tstamp" : 1461708091 },
        "cells" : [ ]
      }
    ]
  }
]
{noformat}

IMHO if we decide to use tstamp to represent timestamp of the writes (whether it's a delete or a regular mutation), then it should always be microseconds since epoch (16 digits), and it should be consistent across regular cells and tombstones. In my view, the "deletion_time" can be a good short name for localDeletionTime (which only guides compaction to do GC) and as long as we are consistent across the board and always use that to represent localDeletionTime that has only 10 digits (seconds since epoch), it's good to me too.

> sstabledump has inconsistency in deletion_time printout
> -------------------------------------------------------
>
>                 Key: CASSANDRA-11656
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11656
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Wei Deng
>              Labels: Tools
>
> See the following output (note the deletion info under the second row):
> {noformat}
> [
>   {
>     "partition" : {
>       "key" : [ "1" ],
>       "position" : 0
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 18,
>         "clustering" : [ "c1" ],
>         "liveness_info" : { "tstamp" : 1461646542601774 },
>         "cells" : [
>           { "name" : "val0_int", "deletion_time" : 1461647421, "tstamp" : 1461647421344759 },
>           { "name" : "val1_set_of_int", "path" : [ "1" ], "deletion_time" : 1461647320, "tstamp" : 1461647320160261 },
>           { "name" : "val1_set_of_int", "path" : [ "10" ], "value" : "", "tstamp" : 1461647295880444 },
>           { "name" : "val1_set_of_int", "path" : [ "11" ], "value" : "", "tstamp" : 1461647295880444 },
>           { "name" : "val1_set_of_int", "path" : [ "12" ], "value" : "", "tstamp" : 1461647295880444 }
>         ]
>       },
>       {
>         "type" : "row",
>         "position" : 85,
>         "clustering" : [ "c2" ],
>         "deletion_info" : { "deletion_time" : 1461647588089843, "tstamp" : 1461647588 },
>         "cells" : [ ]
>       }
>     ]
>   }
> ]
> {noformat}
> To avoid confusion, we need to have consistency in printing out the DeletionTime object.
> By definition, markedForDeleteAt is in microseconds since epoch and marks the time when the "delete" mutation happens; localDeletionTime is in seconds since epoch and allows GC to collect the tombstone if the current epoch second is greater than localDeletionTime + gc_grace_seconds. I'm ok to use "tstamp" to represent markedForDeleteAt because markedForDeleteAt does represent this delete mutation's timestamp, but we need to be consistent everywhere.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
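
Editor's illustration (not part of the JIRA thread or of any patch): the convention proposed above can be checked mechanically against a dump. The sketch below assumes that "tstamp" should always hold microseconds since epoch and "deletion_time" should always hold seconds since epoch, and it guesses the unit purely from the digit count (10 vs. 16 digits), which only holds for present-day dates. The names guess_unit, find_unit_mismatches, is_gc_eligible and EXPECTED_UNIT are hypothetical helpers, not Cassandra or sstabledump APIs.

```python
import json

# Assumed convention from the comment above: "tstamp" is microseconds since
# epoch (markedForDeleteAt / write timestamp), "deletion_time" is seconds
# since epoch (localDeletionTime).
EXPECTED_UNIT = {"tstamp": "microseconds", "deletion_time": "seconds"}


def guess_unit(value):
    """Guess the unit from the digit count (heuristic, valid for current dates)."""
    digits = len(str(abs(int(value))))
    if digits == 10:
        return "seconds"
    if digits == 16:
        return "microseconds"
    return "unknown"


def find_unit_mismatches(node, path="$"):
    """Walk parsed sstabledump JSON and collect (path, value, guessed_unit)
    for every tstamp/deletion_time field whose unit differs from EXPECTED_UNIT."""
    mismatches = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key in EXPECTED_UNIT and isinstance(value, int):
                unit = guess_unit(value)
                if unit != EXPECTED_UNIT[key]:
                    mismatches.append((child, value, unit))
            else:
                mismatches.extend(find_unit_mismatches(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            mismatches.extend(find_unit_mismatches(item, f"{path}[{index}]"))
    return mismatches


def is_gc_eligible(local_deletion_time, gc_grace_seconds, now_seconds):
    """Rule quoted in the issue description: a tombstone may be collected once
    the current epoch second is greater than localDeletionTime + gc_grace_seconds."""
    return now_seconds > local_deletion_time + gc_grace_seconds


# Row "c2" copied from the dump in this message.
row = json.loads(
    '{ "type" : "row", "position" : 86, "clustering" : [ "c2" ], '
    '"deletion_info" : { "deletion_time" : 1461647588089843, '
    '"tstamp" : 1461647588 }, "cells" : [ ] }'
)
for field_path, value, unit in find_unit_mismatches(row):
    print(f"{field_path} = {value} looks like {unit}")

# 864000 (10 days) is used here only as an example gc_grace_seconds value.
print(is_gc_eligible(1461647588, 864000, now_seconds=1461647588 + 864001))
```

Run against row "c2", both fields of deletion_info are reported, since the two values appear swapped relative to the assumed convention. The digit-count heuristic is deliberately crude; a real check would need to know which DeletionTime field each key is printed from.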