cassandra-commits mailing list archives

From "John Sumsion (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8169) Background bitrot detector to avoid client exposure
Date Thu, 23 Oct 2014 20:01:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181852#comment-14181852 ]

John Sumsion commented on CASSANDRA-8169:
-----------------------------------------

Yes, that would probably be sufficient, as long as it doesn't do any writes: just reads and
a report of status.

If there is sstable checksum corruption, I would expect the corrupt sstable policy to be triggered,
either killing the node or ignoring the corruption, based on the config.

Does that mean that we just kill this issue in favor of CASSANDRA-5791?  I think so.  Unless
anyone sees any value in keeping this issue alive (vs CASSANDRA-5791), I'll close this issue
in a couple days to give time for feedback.

> Background bitrot detector to avoid client exposure
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8169
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8169
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: John Sumsion
>
> With a lot of static data sitting in SSTables, and with only a relatively small add/edit
> rate, incremental repair sounds very good.  However, there is one significant cost to switching
> away from full repair.
> If/when bitrot corrupts an SSTable, there is nothing standing between a user query and
> a corruption/failure-response event except for the other replicas.  This, combined with a rolling
> restart or upgrade, can make a token range non-writable via quorum CL.
> While you could argue that full repairs should be scheduled on a longer-term regular
> basis, I don't really care about all the repair overhead.  I just want something that can run
> ahead of user queries and whose only responsibility is to detect bitrot, so that I can replace
> nodes aggressively instead of having it be a failure-response situation.
> This bitrot detector need not incur the full cross-cluster cost of repair, and so would
> be less of a burden to run periodically.
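
As a rough illustration of the read-only check discussed above (reads plus a status report, no
writes), here is a minimal Python sketch.  It is not Cassandra code and does not use Cassandra's
on-disk checksum format: the `expected` manifest of file name to CRC32, and the names
`file_crc32` and `scan_for_bitrot`, are hypothetical and exist only for this example.

```python
import zlib
from pathlib import Path


def file_crc32(path, chunk_size=1 << 20):
    """Compute the CRC32 of a file, reading it in chunks (file is opened read-only)."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def scan_for_bitrot(data_dir, expected):
    """Compare each file against its expected CRC32 and return a status report.

    `expected` is a hypothetical manifest: {file name: expected CRC32}.
    Nothing is written or repaired; the result is a list of (name, status)
    pairs where status is "ok", "corrupt", or "missing".
    """
    report = []
    for name, want in sorted(expected.items()):
        path = Path(data_dir) / name
        if not path.is_file():
            report.append((name, "missing"))
        elif file_crc32(path) != want:
            report.append((name, "corrupt"))
        else:
            report.append((name, "ok"))
    return report
```

A periodic job could run such a scan ahead of user queries and flag a node for replacement
whenever any entry comes back as "corrupt", which is the proactive behavior the issue asks for.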



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
