cassandra-commits mailing list archives

From "Pavel Yaskevich (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data
Date Thu, 04 Aug 2011 21:39:27 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079616#comment-13079616 ]

Pavel Yaskevich edited comment on CASSANDRA-1717 at 8/4/11 9:39 PM:
--------------------------------------------------------------------

This is a good idea, but it has a few complications:

 - the buffer length has to be stored so that the reader can use it
 - reads have to be aligned to that buffer length so that we always read a whole checksummed chunk
of the data, which implies that we will potentially need to read more data on each request

This seems to be a clear tradeoff between using additional space to store a checksum for the index
+ columns of each row vs. doing more I/O...
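
The two complications can be illustrated with a minimal sketch. It is not taken from any patch on this ticket: the header layout, the function names (write_chunks, read_range), the CorruptChunkError class and the use of CRC32 are all assumptions made for illustration. The header stores the buffer length so the reader can use it, and every read is widened to whole chunks so each checksum can be verified.

```python
import struct
import zlib

# Hypothetical on-disk layout (an assumption, not Cassandra's format):
#   header: chunk length (4 bytes) + total data length (8 bytes)
#   then repeated: chunk bytes followed by that chunk's CRC32 (4 bytes)
HEADER = struct.Struct(">IQ")
CRC = struct.Struct(">I")


class CorruptChunkError(Exception):
    """Raised when a chunk does not match its stored checksum."""


def write_chunks(data: bytes, chunk_len: int) -> bytes:
    """Store the buffer length in the header, then checksum each chunk."""
    out = bytearray(HEADER.pack(chunk_len, len(data)))
    for off in range(0, len(data), chunk_len):
        chunk = data[off:off + chunk_len]
        out += chunk
        out += CRC.pack(zlib.crc32(chunk))
    return bytes(out)


def read_range(blob: bytes, offset: int, length: int) -> bytes:
    """Read [offset, offset + length) by verifying every chunk it touches."""
    chunk_len, total = HEADER.unpack_from(blob, 0)
    if offset < 0 or length < 0 or offset + length > total:
        raise ValueError("range outside stored data")
    if length == 0:
        return b""
    # Align the read to chunk boundaries: whole chunks must be read
    # even if only a few bytes of them were requested.
    first = offset // chunk_len
    last = (offset + length - 1) // chunk_len
    buf = bytearray()
    for i in range(first, last + 1):
        start = HEADER.size + i * (chunk_len + CRC.size)
        size = min(chunk_len, total - i * chunk_len)
        chunk = blob[start:start + size]
        (stored,) = CRC.unpack_from(blob, start + size)
        if zlib.crc32(chunk) != stored:
            raise CorruptChunkError(f"checksum mismatch in chunk {i}")
        buf += chunk
    skip = offset - first * chunk_len
    return bytes(buf[skip:skip + length])
```

In this sketch a 100-byte request that straddles a chunk boundary reads two full chunks, which is the extra I/O mentioned above; the space overhead is 4 bytes per chunk instead of one checksum per row.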


      was (Author: xedin):
    This is a good idea but it has few complications:

 - buffer length should be store in order to be used by reader
 - reads should be aligned by that buffer length so we always read a whole checksummed chunk
of the data which implies that we will potentially always need to read more data on each request

This seems to be a clear tradeoff between using additional space to store checksum for index
+ columns for each row v.s. doing more I/O...

  
> Cassandra cannot detect corrupt-but-readable column data
> --------------------------------------------------------
>
>                 Key: CASSANDRA-1717
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>             Fix For: 1.0
>
>         Attachments: checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) unreadable,
> so the data can be replaced by read repair or anti-entropy.  But if the corruption keeps column
> data readable, we do not detect it, and if it corrupts the timestamp to a higher value, the column
> can even resist being overwritten by newer values.
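
The timestamp case from the description can be sketched as follows. This is a simplified last-write-wins rule with hypothetical names (Column, reconcile), not Cassandra's actual reconciliation code; it only shows why a column whose timestamp was corrupted upwards would keep winning against newer writes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    # Hypothetical, simplified column: the real structure has more fields.
    name: str
    value: bytes
    timestamp: int


def reconcile(a: Column, b: Column) -> Column:
    """Simplified last-write-wins: keep the column with the higher timestamp."""
    return a if a.timestamp >= b.timestamp else b


original = Column("email", b"old", 1000)
# Bitrot flips one high bit of the timestamp; the column is still readable.
corrupt = Column("email", b"old", original.timestamp | (1 << 40))
newer = Column("email", b"new", 2000)

# Without corruption the newer write wins; with it, the stale value survives.
winner_clean = reconcile(original, newer)
winner_corrupt = reconcile(corrupt, newer)
```

Here winner_clean is the newer column, while winner_corrupt is the corrupted one, so later writes with ordinary timestamps cannot replace it.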

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
