cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-5454) Changing column_index_size_in_kb on different nodes might corrupt files
Date Thu, 27 Jun 2013 14:30:20 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-5454:
----------------------------------------

    Attachment: 5454.txt

Attaching a patch for this. As said above, this basically reverts the changes from CASSANDRA-5418,
which is OK now that we don't write the row size or column count at the start of the row.

I've checked that the test added by Yukim for CASSANDRA-5418 does pass with this patch. 
                
> Changing column_index_size_in_kb on different nodes might corrupt files
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-5454
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5454
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 2.0
>
>         Attachments: 5454.txt
>
>
> Range tombstones require that we sometimes repeat a few markers in the data file at index
> boundaries. This means that the same row written with different column_index_size_in_kb
> values will not have the same data size.
> This is a problem for streaming: if column_index_size_in_kb differs between the source
> and the destination, the resulting row should have a different size on the destination,
> but in 1.2 streaming relies on the data size not changing.
> Now, while having different column_index_size_in_kb values on different nodes is probably
> not very useful in the long run, you may still have temporary discrepancies because there
> is no real way to change the setting on all nodes atomically. Besides, it's not too hard to
> end up with different settings on different nodes due to human error. Currently, the result
> is that if a file is streamed while the setting is not consistent, we'll end up corrupting
> the received file (due to the fix from CASSANDRA-5418, to be precise).
> I don't see a good way to fix this in 1.2, so users will have to be careful not to have
> streaming happening while they change the column_index_size_in_kb setting. But in 2.0, once
> CASSANDRA-4180 is committed, we will no longer have to respect the source's dataSize on
> the destination. So basically we should revert the fix from CASSANDRA-5418 (though we may
> still want to avoid repeating unneeded markers, but the tombstoneTracker can give us that
> easily).
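
For illustration only, here is a toy Python model of why the data size depends on the
index block size. It is not Cassandra code: the names Column, RangeTombstone and data_size
are made up for this sketch, and it assumes a simplified layout in which every range
tombstone still open at an index boundary is written again at the start of the next block.

```python
from dataclasses import dataclass


@dataclass
class Column:
    size: int  # serialized size in bytes (hypothetical)


@dataclass
class RangeTombstone:
    size: int    # serialized size of the marker in bytes (hypothetical)
    covers: int  # number of following columns the tombstone spans


def data_size(atoms, index_block_bytes):
    """Total bytes written for a row, repeating open markers at index boundaries."""
    total = 0
    block_fill = 0
    open_markers = []  # markers that still cover upcoming columns
    for atom in atoms:
        # The current index block is full: start a new one and repeat open markers.
        if block_fill >= index_block_bytes:
            block_fill = 0
            for marker in open_markers:
                total += marker["size"]
                block_fill += marker["size"]
        total += atom.size
        block_fill += atom.size
        if isinstance(atom, RangeTombstone):
            open_markers.append({"size": atom.size, "left": atom.covers})
        else:
            # A column was written: every open marker covers one column fewer.
            for marker in open_markers:
                marker["left"] -= 1
            open_markers = [m for m in open_markers if m["left"] > 0]
    return total


# One range tombstone spanning the first four of six 100-byte columns.
row = [RangeTombstone(size=20, covers=4)] + [Column(size=100) for _ in range(6)]

small_blocks = data_size(row, index_block_bytes=200)
large_blocks = data_size(row, index_block_bytes=1000)
print(small_blocks, large_blocks)
```

In this toy row the marker is repeated once with 200-byte index blocks and never with
1000-byte blocks, so the same logical row serializes to 640 bytes versus 620 bytes. A
receiver that expects the source's size while re-indexing with a different block size
would therefore see a mismatch.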

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
