cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (CASSANDRA-8703) incremental repair vs. bitrot
Date Thu, 05 Feb 2015 23:21:34 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa reassigned CASSANDRA-8703:
-------------------------------------

    Assignee: Jeff Jirsa

> incremental repair vs. bitrot
> -----------------------------
>
>                 Key: CASSANDRA-8703
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8703
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Robert Coli
>            Assignee: Jeff Jirsa
>
> Incremental repair is a great improvement in Cassandra, but it lacks a feature
that non-incremental repair provides: protection against bitrot.
> Scenario:
> 1) repair SSTable, marking it repaired
> 2) cosmic ray hits hard drive, corrupting a record in SSTable
> 3) range is now actually unrepaired, but the SSTable is still marked repaired as of
the time of step 1
> From my understanding, if bitrot is detected (e.g. via the CRC on the read path), then all
SSTables containing the corrupted range need to be marked unrepaired on all replicas. Per
marcuse@IRC, the naive/simplest response would be to just trigger a full repair in this case.
> I am concerned about incremental repair as an operational default while it does not handle
this case. As an aside, this would also seem to require a new CRC on the uncompressed read
path, as otherwise one cannot detect the corruption without periodic checksumming of SSTables.
Alternatively, a "nodetool checksum" function that verified table checksums, marked ranges
unrepaired on failure, and could be run every gc_grace_seconds would seem to meet the
requirement.
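
The proposed checksum pass can be illustrated with a small sketch. This is not Cassandra code: the SSTable class, write_sstable, verify_and_demote, and flip_bit are hypothetical names invented for illustration, and a single whole-file CRC32 stands in for whatever checksum scheme a real implementation would use. It only shows the idea of recomputing checksums and clearing the repaired flag on a mismatch.

```python
import zlib
from dataclasses import dataclass


@dataclass
class SSTable:
    """Hypothetical stand-in for an on-disk SSTable (not Cassandra's class)."""
    name: str
    data: bytes
    stored_crc: int        # checksum recorded when the table was written
    repaired: bool = False


def write_sstable(name: str, data: bytes) -> SSTable:
    # Record a CRC32 of the contents at write time.
    return SSTable(name, data, zlib.crc32(data))


def flip_bit(data: bytes, bit_index: int) -> bytes:
    # Simulate bitrot by flipping one bit.
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def verify_and_demote(tables):
    """Recompute each checksum; mark mismatching tables unrepaired.

    Returns the names of the tables that failed verification.
    """
    failed = []
    for table in tables:
        if zlib.crc32(table.data) != table.stored_crc:
            table.repaired = False
            failed.append(table.name)
    return failed


# Step 1 of the scenario: both tables are repaired and marked as such.
good = write_sstable("a", b"row1,row2")
bad = write_sstable("b", b"row3,row4")
good.repaired = bad.repaired = True

# Step 2: one bit in table "b" is corrupted after the repair.
bad.data = flip_bit(bad.data, 5)

# The periodic pass catches the mismatch and clears the repaired flag.
failed = verify_and_demote([good, bad])
print(failed, good.repaired, bad.repaired)
```

In this toy model the failed table simply loses its repaired flag on one node; the description above additionally calls for the affected range to be marked unrepaired on all replicas, which the sketch does not attempt.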



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
