cassandra-commits mailing list archives

From "Wei Deng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11656) sstabledump has inconsistency in deletion_time printout
Date Thu, 28 Apr 2016 22:16:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263139#comment-15263139 ]

Wei Deng commented on CASSANDRA-11656:
--------------------------------------

I checked out the [latest patch|https://issues.apache.org/jira/secure/attachment/12800990/trunk-11655v2.patch].
The markedForDeleteAt and localDeletionTime are represented correctly in my examples now.

[~cnlwsu], I like that you've also added human-readable timestamps to the output. However,
IMHO it would be more appropriate to keep the existing epoch format as the default and to
print human-readable timestamps only when the "-t" option is given. The epoch format is how
sstable2json presented time, so people are familiar with it, and it shows detail down to the
microsecond, which is useful in some troubleshooting scenarios.

To give an example of why the old epoch format can be useful, take a look at the following
output for a set<int> column:

{noformat}
        "cells" : [
          { "name" : "val0_int", "value" : "500", "tstamp" : "1461650241403672" },
          { "name" : "val1_set_of_int", "deletion_info" : { "marked_deleted" : "1461650241403671", "local_delete_time" : "1461650241" } },
          { "name" : "val1_set_of_int", "path" : [ "111" ], "value" : "", "tstamp" : "1461650241403672" },
          { "name" : "val1_set_of_int", "path" : [ "222" ], "value" : "", "tstamp" : "1461650241403672" },
          { "name" : "val1_set_of_int", "path" : [ "333" ], "value" : "", "tstamp" : "1461650241403672" }
        ]
{noformat}

Note that the set<int> column called "val1_set_of_int" above has its deletion_info at the
beginning. This has been the convention since before the CASSANDRA-8099 storage engine rewrite.
The markedForDeleteAt timestamp at the collection level is exactly one microsecond less than
the timestamps at the collection element level (in the pre-3.0 world it was always a range
tombstone at the beginning, but that range tombstone was likewise always marked with a
timestamp exactly one microsecond less than the live cells). If we expose the human-readable
timestamp by default, this kind of information is not immediately visible. Of course you can
still choose to use "-t", so I'm OK either way, but I'd like to point this out in any case.
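
To make the point concrete, here is a minimal Python sketch (not taken from the patch) that
uses the two values from the output above. The helper name to_utc_seconds and the
one-second output format are assumptions for illustration only; the patch's actual
human-readable format may differ.

```python
from datetime import datetime, timezone

# Values copied from the example output above.
MARKED_DELETED_US = 1461650241403671  # collection-level markedForDeleteAt (microseconds)
CELL_TSTAMP_US = 1461650241403672     # tstamp on each live set element (microseconds)


def to_utc_seconds(epoch_us: int) -> str:
    """Render a microsecond epoch at one-second resolution (hypothetical format)."""
    dt = datetime.fromtimestamp(epoch_us // 1_000_000, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


# In raw epoch form the one-microsecond gap is obvious.
delta_us = CELL_TSTAMP_US - MARKED_DELETED_US

# Once rendered at one-second resolution the two values can no longer be told apart.
same_when_rendered = to_utc_seconds(MARKED_DELETED_US) == to_utc_seconds(CELL_TSTAMP_US)

print(delta_us, same_when_rendered)
```

Any rendering coarser than microseconds hides the ordering between the collection tombstone
and its live elements, which is the detail the raw epoch output keeps visible.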


> sstabledump has inconsistency in deletion_time printout
> -------------------------------------------------------
>
>                 Key: CASSANDRA-11656
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11656
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Wei Deng
>              Labels: Tools
>
> See the following output (note the deletion info under the second row):
> {noformat}
> [
>   {
>     "partition" : {
>       "key" : [ "1" ],
>       "position" : 0
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 18,
>         "clustering" : [ "c1" ],
>         "liveness_info" : { "tstamp" : 1461646542601774 },
>         "cells" : [
>           { "name" : "val0_int", "deletion_time" : 1461647421, "tstamp" : 1461647421344759 },
>           { "name" : "val1_set_of_int", "path" : [ "1" ], "deletion_time" : 1461647320, "tstamp" : 1461647320160261 },
>           { "name" : "val1_set_of_int", "path" : [ "10" ], "value" : "", "tstamp" : 1461647295880444 },
>           { "name" : "val1_set_of_int", "path" : [ "11" ], "value" : "", "tstamp" : 1461647295880444 },
>           { "name" : "val1_set_of_int", "path" : [ "12" ], "value" : "", "tstamp" : 1461647295880444 }
>         ]
>       },
>       {
>         "type" : "row",
>         "position" : 85,
>         "clustering" : [ "c2" ],
>         "deletion_info" : { "deletion_time" : 1461647588089843, "tstamp" : 1461647588 },
>         "cells" : [ ]
>       }
>     ]
>   }
> ]
> {noformat}
> To avoid confusion, we need to have consistency in printing out the DeletionTime object.
> By definition, markedForDeleteAt is in microseconds since epoch and marks the time when the
> "delete" mutation happens; localDeletionTime is in seconds since epoch and allows GC to collect
> the tombstone if the current epoch second is greater than localDeletionTime + gc_grace_seconds.
> I'm ok to use "tstamp" to represent markedForDeleteAt because markedForDeleteAt does represent
> this delete mutation's timestamp, but we need to be consistent everywhere.
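
The time comparison described in the issue can be sketched in Python as follows. This models
only the rule as stated above (current epoch second greater than localDeletionTime +
gc_grace_seconds) and is not Cassandra's actual purge logic. The function name
tombstone_is_purgeable is hypothetical, and 864000 seconds (10 days) is assumed as the
gc_grace_seconds value; it is configurable per table.

```python
# Assumed gc_grace_seconds value (10 days); configurable per table.
GC_GRACE_SECONDS = 864_000


def tombstone_is_purgeable(local_deletion_time_s: int, now_s: int,
                           gc_grace_seconds: int = GC_GRACE_SECONDS) -> bool:
    """Hypothetical helper: only the time comparison quoted in the description."""
    return now_s > local_deletion_time_s + gc_grace_seconds


# Seconds and microseconds values printed for row "c2" in the output above.
LOCAL_DELETION_TIME_S = 1461647588
MARKED_FOR_DELETE_AT_US = 1461647588089843

# Earliest epoch second at which the comparison becomes true.
first_purgeable_second = LOCAL_DELETION_TIME_S + GC_GRACE_SECONDS + 1

# Reading the microsecond field as if it were localDeletionTime breaks the check:
# the tombstone never looks purgeable at any realistic epoch second.
misread = tombstone_is_purgeable(MARKED_FOR_DELETE_AT_US, first_purgeable_second)

print(first_purgeable_second, misread)
```

The last two lines show why the swapped labels matter: a reader who trusts the field names
in the "c2" row would feed a microsecond value into a seconds-based comparison.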



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
