cassandra-commits mailing list archives

From "Blake Eggleston (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-14861) sstable min/max metadata can cause data loss
Date Thu, 01 Nov 2018 17:53:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671934#comment-16671934 ]

Blake Eggleston commented on CASSANDRA-14861:
---------------------------------------------

[~benedict], [~beobal], [~iamaleksey] would one or more of you be interested in reviewing?

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0
>
>
> There’s a bug in the way we filter sstables in the read path that can cause sstables
containing relevant range tombstones to be excluded from reads. This can cause data resurrection
for an individual read, and if compaction timing is right, permanent resurrection via read
repair. 
> We track the min and max clustering values when writing an sstable so we can avoid reading
from sstables that don’t contain the clustering values we’re looking for in a given read.
The min/max for each clustering column is updated for each row / RT marker we write. In the
case of range tombstone markers though, we only update the min/max for the clustering values
they contain, which is almost never the full set of clustering values. This leaves a min/max
that is above/below (respectively) the real range covered by the range tombstones contained
in the sstable.
> For instance, assume we’re writing an sstable for a table with 3 clustering columns.
The current min clustering is 5:6:7. We write an RT marker for a range tombstone that deletes
any row with the value 4 in the first clustering column, so the open marker is [4:]. This would
make the new min clustering 4:6:7 when it should really be 4:. If we do a read for clustering
values of 4:5 and lower, we’ll exclude this sstable and its range tombstone, resurrecting
any data there that this tombstone would have deleted.
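
The example above can be reproduced with a small model. This is only a sketch, not Cassandra code: clusterings are plain integer tuples, and the function names (update_min_per_column, update_min_prefix_aware, may_contain) are hypothetical. The prefix-aware variant is an assumption about what a correct min could look like, not the committed fix.

```python
from typing import Tuple

# A full clustering or a prefix of one (simplified, hypothetical model).
Clustering = Tuple[int, ...]


def update_min_per_column(current_min: Clustering, bound: Clustering) -> Clustering:
    """Per-column update as described in the ticket: only the columns present
    in the bound are touched; the other columns keep their old minimum."""
    new_min = list(current_min)
    for i, value in enumerate(bound):
        new_min[i] = min(new_min[i], value)
    return tuple(new_min)


def update_min_prefix_aware(current_min: Clustering, bound: Clustering) -> Clustering:
    """Illustrative alternative (an assumption, not the real patch): a bound
    that sorts before the current min replaces it as a shorter prefix."""
    n = min(len(bound), len(current_min))
    if bound[:n] < current_min[:n]:
        return tuple(bound)
    if bound[:n] == current_min[:n]:
        return tuple(bound[:n])  # equal on shared columns: keep the shorter prefix
    return current_min


def may_contain(sstable_min: Clustering, slice_end: Clustering) -> bool:
    """Skip the sstable only when the slice ends strictly before sstable_min
    on the columns that both values define."""
    n = min(len(sstable_min), len(slice_end))
    return slice_end[:n] >= sstable_min[:n]


current_min = (5, 6, 7)
open_marker = (4,)    # RT open bound [4:]
slice_end = (4, 5)    # read for clusterings 4:5 and lower

buggy_min = update_min_per_column(current_min, open_marker)    # (4, 6, 7)
fixed_min = update_min_prefix_aware(current_min, open_marker)  # (4,)

print(may_contain(buggy_min, slice_end))  # False: sstable and its tombstone are skipped
print(may_contain(fixed_min, slice_end))  # True: sstable is read
```

With the per-column min of 4:6:7, the slice ending at 4:5 compares below the sstable's min and the sstable is skipped, even though the tombstone opened at [4:] covers that slice.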



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

