cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10342) Read defragmentation can cause unnecessary repairs
Date Wed, 16 Sep 2015 06:35:46 GMT


Marcus Eriksson commented on CASSANDRA-10342:

Patch [here|] that only defragments if we read exclusively from unrepaired sstables.
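
To illustrate the idea (not the actual patch code - the class and helper below are made up for this sketch), the check boils down to refusing to defragment as soon as any sstable touched by the read is marked repaired:

{code:java}
import java.util.Arrays;
import java.util.List;

// Stand-in for an sstable, for illustration only; the real Cassandra class is
// SSTableReader, which tracks a repairedAt timestamp rather than a plain flag.
class SSTableSketch
{
    final boolean repaired;

    SSTableSketch(boolean repaired)
    {
        this.repaired = repaired;
    }

    boolean isRepaired()
    {
        return repaired;
    }
}

public class DefragGuardSketch
{
    // Only "hoist" the read result into a newer sstable when every sstable
    // touched by the read is still unrepaired, so defragmentation can never
    // copy already-repaired data back into the unrepaired set.
    static boolean mayDefragment(List<SSTableSketch> sstablesRead)
    {
        return sstablesRead.stream().noneMatch(SSTableSketch::isRepaired);
    }

    public static void main(String[] args)
    {
        List<SSTableSketch> read = Arrays.asList(new SSTableSketch(false),
                                                 new SSTableSketch(true),
                                                 new SSTableSketch(false));
        System.out.println(mayDefragment(read)); // false - one sstable is repaired
    }
}
{code}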

But I wonder if we should keep doing defragmentation at all in 2.2+, where incremental repair
is the default - we should probably run a few benchmarks to see whether the over-repair is worth it.
[~mambocab] do you have cycles to run the benchmarks? I pushed a branch without the defragmentation
[here|] - we would need to run a mixed workload with a few incremental repairs thrown in and compare
read latency and amount of data streamed against standard 2.1.

> Read defragmentation can cause unnecessary repairs
> --------------------------------------------------
>                 Key: CASSANDRA-10342
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Olsson
>            Priority: Minor
> After applying the fix from CASSANDRA-10299 to the cluster, we started seeing ~20k small
> sstables appear for the table with static data when running incremental repair.
> In the logs there were several messages about flushes for that table, one for each repaired
> range. The flushed sstables were 0.000kb in size with < 100 ops in each. When checking
> cfstats we saw several writes to that table, even though we were only reading from it
> and read repair did not repair anything.
> After digging around in the codebase I noticed that defragmentation of data can occur
> while reading, depending on the query and some other conditions. This causes the read data
> to be re-inserted so that it lands in a more recent sstable, which can be a problem if that
> data was repaired using incremental repair. The defragmentation is done in [|].
> I guess this wasn't a problem with full repairs, since I assume the digest should
> be the same even if you have two copies of the same data. But with incremental repair this
> will most probably cause a mismatch between nodes if that data was already repaired, since
> the other nodes probably won't have that data in their unrepaired set.
> ------
> I can add that the problems on our cluster were probably due to the fact that CASSANDRA-10299
> caused the same data to be streamed multiple times, ending up in several sstables. One
> of the conditions for the defragmentation is that the number of sstables read during a read
> request has to be more than the minimum number of sstables needed for a compaction (> 4
> in our case). So normally I don't think this would cause ~20k sstables to appear; we probably
> hit an extreme case.
> One workaround for this is to use a compaction strategy other than STCS (it seems to be
> the only affected strategy, at least in 2.1), but the solution might be to either make
> defragmentation configurable per table or avoid re-inserting the data if any of the sstables
> involved in the read are repaired.
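
For readers not familiar with the read-defragmentation path, here is a minimal standalone sketch of the trigger condition as the quoted report describes it (the names are invented for illustration; the real logic lives in Cassandra's read/collation code and is not reproduced here):

{code:java}
public class DefragTriggerSketch
{
    enum CompactionStrategy { SIZE_TIERED, LEVELED, DATE_TIERED }

    // The read result is re-inserted ("defragmented") only when the number of
    // sstables touched by the read exceeds the table's minimum compaction
    // threshold (default 4) and the table uses size-tiered compaction.
    static boolean defragmentationTriggered(int sstablesIterated,
                                            int minCompactionThreshold,
                                            CompactionStrategy strategy)
    {
        return strategy == CompactionStrategy.SIZE_TIERED
               && sstablesIterated > minCompactionThreshold;
    }

    public static void main(String[] args)
    {
        // A read touching 5 sstables with the default threshold of 4 triggers
        // the re-insert; 3 sstables does not. With CASSANDRA-10299 spreading
        // the same data across many sstables, this condition could fire on many
        // reads, matching the flood of tiny flushed sstables described above.
        System.out.println(defragmentationTriggered(5, 4, CompactionStrategy.SIZE_TIERED)); // true
        System.out.println(defragmentationTriggered(3, 4, CompactionStrategy.SIZE_TIERED)); // false
    }
}
{code}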

This message was sent by Atlassian JIRA
