cassandra-commits mailing list archives

From "John Carrino (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6897) Add checksum to the Summary File and Bloom Filter file of SSTables
Date Fri, 21 Mar 2014 19:47:44 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943458#comment-13943458 ]

John Carrino edited comment on CASSANDRA-6897 at 3/21/14 7:46 PM:
------------------------------------------------------------------

I understand that the mmap path means that the actual sstable cannot contain checksums unless
it is compressed.  We (my clusters) compress all tables for this reason.  We need to detect
failures fast and early as we cannot afford any data loss and need to repair right away. 

I would also like to add that the Index file should have check-summing on each entry because
a corruption in that file may mean that bogus data is read and returned.  Maybe not on each
entry, but every "block" (512B - 1KB) starting at the entry points from the summary file.
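
To make the block idea concrete, here is a minimal sketch of per-block checksumming. It is not
the actual Cassandra on-disk format: `BlockChecksums`, `BLOCK_SIZE`, and the method names are
hypothetical, the 1 KB block size is an assumption taken from the upper end of the range above,
and CRC32 is used only because it ships with the JDK.

```java
import java.util.zip.CRC32;

/** Hypothetical helper (not a Cassandra class): one CRC32 per fixed-size block. */
final class BlockChecksums {
    // Assumed block size; the comment above suggests somewhere in 512B - 1KB.
    static final int BLOCK_SIZE = 1024;

    /** Computes one CRC32 per BLOCK_SIZE chunk; the last block may be shorter. */
    static long[] compute(byte[] data) {
        long[] sums = new long[blockCount(data)];
        for (int i = 0; i < sums.length; i++) {
            sums[i] = crcOfBlock(data, i);
        }
        return sums;
    }

    /** Returns the index of the first block whose CRC32 differs, or -1 if all match. */
    static int firstCorruptBlock(byte[] data, long[] expected) {
        if (blockCount(data) != expected.length) {
            throw new IllegalArgumentException("block count does not match checksum count");
        }
        for (int i = 0; i < expected.length; i++) {
            if (crcOfBlock(data, i) != expected[i]) {
                return i;
            }
        }
        return -1;
    }

    private static int blockCount(byte[] data) {
        return (data.length + BLOCK_SIZE - 1) / BLOCK_SIZE; // ceiling division
    }

    private static long crcOfBlock(byte[] data, int block) {
        int offset = block * BLOCK_SIZE;
        int length = Math.min(BLOCK_SIZE, data.length - offset);
        CRC32 crc = new CRC32();
        crc.update(data, offset, length);
        return crc.getValue();
    }
}
```

With something like this, a reader that seeks to a summary entry point would only need to verify
the blocks it actually touches, rather than the whole Index file.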
 

I think this will go a long way towards peace of mind that Cassandra is returning the right
results even on hardware that may have issues.



was (Author: johnyoh):
I understand that the mmap path means that the actual sstable cannot contain checksums unless
it is compressed.  We compress all tables for this reason.  We need to detect failures fast
and early as we cannot afford any data loss and need to repair right away. 

I would also like to add that the Index file should have check-summing on each entry because
a corruption in that file may mean that bogus data is read and returned.  Maybe not on each
entry, but every "block" (512B - 1KB) starting at the entry points from the summary file.
 

I think this will go a long way towards peace of mind that Cassandra is returning the right
results even on hardware that may have issues.


> Add checksum to the Summary File and Bloom Filter file of SSTables
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-6897
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6897
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Adam Hattrell
>
> Could we add a checksum to the Summary file and the Filter file of the SSTable?
> Since Cassandra reads the whole bloom filter before actually reading data, it seems like it would
> make sense to checksum the bloom filter to make sure there is no corruption there. The same is
> true of the summary file. The core of our question is: can you add checksumming to all elements
> of the SSTable so that if we read anything corrupt we immediately see a failure?
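
For a component that is read in full before use, such as the bloom filter or the summary, the
request above could be sketched as a whole-file checksum stored in a trailer and verified on
load. This is only an illustration under stated assumptions: `ChecksummedComponent`, `seal`, and
`open` are hypothetical names, and the 4-byte CRC32 trailer is an assumed layout, not the format
Cassandra uses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Hypothetical helper (not a Cassandra class): payload followed by a CRC32 trailer. */
final class ChecksummedComponent {
    private static final int TRAILER_BYTES = 4; // assumed layout: big-endian CRC32 at the end

    /** Returns the payload with its CRC32 appended. */
    static byte[] seal(byte[] payload) {
        return ByteBuffer.allocate(payload.length + TRAILER_BYTES)
                .put(payload)
                .putInt((int) crc32(payload, payload.length))
                .array();
    }

    /** Verifies the trailer and returns the payload; fails immediately on any mismatch. */
    static byte[] open(byte[] sealed) {
        if (sealed.length < TRAILER_BYTES) {
            throw new UncheckedIOException(new IOException("component too short for a checksum"));
        }
        int payloadLength = sealed.length - TRAILER_BYTES;
        int stored = ByteBuffer.wrap(sealed, payloadLength, TRAILER_BYTES).getInt();
        int actual = (int) crc32(sealed, payloadLength);
        if (stored != actual) {
            throw new UncheckedIOException(new IOException("checksum mismatch, component is corrupt"));
        }
        return Arrays.copyOf(sealed, payloadLength);
    }

    private static long crc32(byte[] bytes, int length) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, length);
        return crc.getValue();
    }
}
```

Failing in `open` before the bytes are interpreted is what gives the "immediately see a failure"
behaviour asked for above.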



--
This message was sent by Atlassian JIRA
(v6.2#6252)
