cassandra-user mailing list archives

From Anand Somani <meatfor...@gmail.com>
Subject Re: Best way to detect/fix bitrot today?
Date Tue, 08 Feb 2011 14:23:29 GMT
I should have clarified: we have 3 copies, so in that case, as long as 2 match
we should be OK?

Even if there were checksumming at the SSTable level, I assume it would have
to check and report these errors on compaction (or node repair)?

I have seen some open JIRA issues on this (47 and 1717), but if I need
something today, is a read repair (or a node repair) the only viable
option?



On Mon, Feb 7, 2011 at 12:09 PM, Peter Schuller <peter.schuller@infidyne.com
> wrote:

> > Our application space is such that there is data that might not be read
> > for a long time. The data is mostly immutable. How should I approach
> > detecting/solving the bitrot problem? One approach is to read data and
> > let read repair do the detection, but given the size of data, that does
> > not look very efficient.
>
> Note that read repair is not really intended to repair arbitrary
> corruption. Unless I'm mistaken, unless the corruption triggers a
> serialization failure that causes row skipping, it's a toss-up which
> version of the data is retained (or both are, if the corruption is in
> the key). Given the same key and column timestamp, the tie breaker is
> the column value. So depending on whether the corruption results in a
> "lesser" or "greater" value, you might get the corrupt or the
> non-corrupt data.
>
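The tie-break behavior described above can be sketched as follows. This is a simplified illustration of the described resolution rule, not Cassandra's actual code: with equal timestamps, the bytewise-greater value wins, so a bit flip that makes a value compare "greater" can survive read repair.

```python
# Illustrative sketch of last-write-wins resolution with a value
# tie-break: same timestamp means the lexically greater value is kept.

def resolve(a: tuple, b: tuple) -> bytes:
    """Pick a winner between two (timestamp, value) replica versions."""
    ts_a, val_a = a
    ts_b, val_b = b
    if ts_a != ts_b:
        return val_a if ts_a > ts_b else val_b
    # Tie on timestamp: the bytewise-greater value wins.
    return max(val_a, val_b)

good = (100, b"expected")
flipped = (100, b"fxpected")  # one corrupted byte, compares greater
print(resolve(good, flipped))  # → b'fxpected' (the corrupt copy wins)
```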
> > Has anybody solved/worked around this, or has any other suggestions to
> > detect and fix bitrot?
>
> My feel/tentative opinion is that the clean fix is for Cassandra to
> support strong checksumming at the sstable level.
>
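The kind of checksumming suggested above might look like the following sketch: store a CRC32 per fixed-size block at write time and verify on read. This is a hypothetical illustration of the idea, not Cassandra's sstable format; the function names are made up.

```python
# Hypothetical block-level checksumming sketch for bitrot detection:
# a CRC32 per fixed-size block, verified against stored checksums.
import zlib

BLOCK = 4096

def checksums(data: bytes) -> list[int]:
    """Compute a CRC32 for each BLOCK-sized chunk of data."""
    return [zlib.crc32(data[i:i + BLOCK])
            for i in range(0, len(data), BLOCK)]

def verify(data: bytes, sums: list[int]) -> list[int]:
    """Return indices of blocks whose checksum no longer matches."""
    return [i for i, c in enumerate(checksums(data)) if c != sums[i]]

payload = b"x" * 10000
sums = checksums(payload)
corrupted = payload[:5000] + b"y" + payload[5001:]  # flip one byte
print(verify(corrupted, sums))  # → [1] (the block holding byte 5000)
```

A scrub job could run `verify` periodically over cold data and trigger a repair from another replica for any failing block, which addresses data that is never read in the normal request path.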
> Deploying on e.g. ZFS would help a lot with this, but that's a problem
> for deployment on Linux (which is the recommended platform for
> Cassandra).
>
> --
> / Peter Schuller
>
