cassandra-commits mailing list archives

From "Charlie Groves (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-5351) Avoid repairing already-repaired data by default
Date Mon, 20 May 2013 23:59:16 GMT


Charlie Groves commented on CASSANDRA-5351:

Won't nodes other than the coordinator only have portions of their sstables repaired? If node
A is coordinating repair and is responsible for keys 1-50 and node B is responsible for keys
26-75, B won't get repairs for 51-75 during the repair A runs. If B marked its sstables as
repaired anyway, later repairs would skip them and 51-75 would never be repaired.

Also, if A is coordinating and there are replicas for 26-50 other than B, B won't get repairs
for that range from replicas other than A. It would still have invalid data in that range
if one of the other replicas had the correct data.
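
The first scenario above can be sketched as interval arithmetic. This is only an illustration: the helper names `overlap` and `unrepaired` are hypothetical, key ranges are modeled as inclusive integer pairs rather than Cassandra tokens, and nothing here reflects Cassandra's actual repair code.

```python
def overlap(a, b):
    """Inclusive intersection of two key ranges, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def unrepaired(owned, repaired):
    """Parts of the inclusive range `owned` not covered by `repaired`.

    Assumes `repaired` is None or lies entirely inside `owned`.
    """
    if repaired is None:
        return [owned]
    gaps = []
    if owned[0] < repaired[0]:
        gaps.append((owned[0], repaired[0] - 1))
    if repaired[1] < owned[1]:
        gaps.append((repaired[1] + 1, owned[1]))
    return gaps


node_a = (1, 50)   # coordinator's range, from the example above
node_b = (26, 75)  # neighbour's range, from the example above

synced = overlap(node_a, node_b)     # what B receives while A coordinates
gap_b = unrepaired(node_b, synced)   # what B still has to repair
print(synced, gap_b)
```

Under these assumptions B is repaired only for 26-50, so marking B's whole sstables as repaired would hide the 51-75 gap.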

> Avoid repairing already-repaired data by default
> ------------------------------------------------
>                 Key: CASSANDRA-5351
>                 URL:
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>              Labels: repair
>             Fix For: 2.0
> Repair has always built its merkle tree from all the data in a columnfamily, which is
> guaranteed to work but is inefficient.
> We can improve this by remembering which sstables have already been successfully repaired,
> and only repairing sstables new since the last repair. (This automatically makes CASSANDRA-3362
> much less of a problem too.)
> The tricky part is, compaction will (if not taught otherwise) mix repaired data together
> with non-repaired. So we should segregate unrepaired sstables from the repaired ones.
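
The segregation idea in the description could look roughly like the sketch below. The `SSTable` record, its `repaired` flag, and `compaction_groups` are hypothetical names for illustration only, and the ticket does not specify how the repaired state would actually be tracked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable: a name plus a repaired flag."""
    name: str
    repaired: bool


def compaction_groups(sstables):
    """Split sstables into (repaired, unrepaired) so compaction never mixes them."""
    repaired = [s for s in sstables if s.repaired]
    unrepaired = [s for s in sstables if not s.repaired]
    return repaired, unrepaired


tables = [
    SSTable("cf-1", repaired=True),
    SSTable("cf-2", repaired=False),
    SSTable("cf-3", repaired=True),
]
repaired_group, unrepaired_group = compaction_groups(tables)
print([s.name for s in repaired_group], [s.name for s in unrepaired_group])
```

A compaction strategy would then pick candidates from within one group at a time, so that only the unrepaired group needs to be fed into the next merkle tree.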
