cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sam Tunnicliffe (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-14843) Drop/add column name with different Kind can result in corruption
Date Thu, 29 Nov 2018 11:19:00 GMT


Sam Tunnicliffe updated CASSANDRA-14843:
    Status: Ready to Commit  (was: Patch Available)

> Drop/add column name with different Kind can result in corruption
> -----------------------------------------------------------------
>                 Key: CASSANDRA-14843
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Major
>             Fix For: 3.0.18
> While we have always imposed that the type of any column name remains consistent, we
have not done the same for its kind.  If a column’s kind is changed via Drop/Add, the SerializationHeader’s
column sets will be interpreted incorrectly, and may either lead to CorruptSSTableException
or silent data corruption.
> The problem occurs in SerializationHeader.Component.toHeader().  In this method, we lookup
columns from the metadata only by name, ignoring the static status.  If a column of the same
name but different kind to our dropped column exists in the schema, we will only add this
column to the list of fields we expect.  So we will be missing a column from our set of expected
static columns.  We will also add the regular variant to the set of regular columns present,
which it may not be.
> There are a number of ways this can play out:
> 1) There are no other static columns in the affected sstables.  In this case we will
incorrectly treat the table as containing no static data, so we will not attempt to read any
static row.  This leaves a static row to be read as the first regular row, and will throw
the exception in the description.
> 1a) If for some reason the static row is absent - say, it has expired - then depending
on lexicographical ordering, and if the sstable did not contain any instance of the regular
variant, we may misattribute column data.  This will probably result in corruption exceptions,
but if the columns were of the same type it could be undetectable.
> 2) There are other static columns on the table, and in the affected sstables.  In this
case, the row will be correctly treated as a static row, but depending on the dropped column’s
lexicographical position relative to these, it may result in column data being misattributed
to other columns.  This *could* be undetectable corruption, if the column types were the same.
> 3) Either of these above scenario could be replayed with the static/regular relations
> CASSANDRA-14591 would protect against 1a and 2.
> Probably the safest and correct fix is to prevent adding a column back with a different
kind to the original, however this would require storing more metadata about dropped columns.
 This is probably viable for 4.0, but for 3.0 perhaps too invasive. 
> It is quite easy to make this particular method safe against this particular corruption.
 However it is hard to be certain there aren’t other places where assumptions were made
about a name being of only one kind.  There are only a handful of places that *should* need
to be of concern, specifically those places that invoke CFMetaData.getDroppedColumn, so we
should be fairly confident that - if these call sites are safe - we are now insulated.  This
might suffice for 3.0.  Thoughts?

This message was sent by Atlassian JIRA

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message