cassandra-user mailing list archives

From Robert Coli <>
Subject Re: Questions related to the data in SSTable files
Date Wed, 23 Oct 2013 15:39:38 GMT
On Wed, Oct 23, 2013 at 5:23 AM, java8964 java8964 <> wrote:

> We enabled the major repair on every node every 7 days.

This is almost certainly the cause of your many duplicates.

If you don't DELETE heavily, consider changing gc_grace_seconds to 34 days
and then doing a repair on the first of the month.
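
As a rough sketch of the arithmetic behind that suggestion: gc_grace_seconds is given in seconds, and the longest gap between two repairs started on the first of the month is 31 days. The variable names below are illustrative only; the underlying assumption is the usual guidance that every node gets repaired at least once per gc_grace_seconds window.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# gc_grace_seconds is expressed in seconds; 34 days is the value suggested above.
gc_grace_seconds = 34 * SECONDS_PER_DAY

# Longest possible gap between two repairs that each start on the first of a month.
longest_month_days = 31
max_repair_gap_seconds = longest_month_days * SECONDS_PER_DAY

# Days of slack left for the repair itself to finish inside the window.
slack_days = (gc_grace_seconds - max_repair_gap_seconds) // SECONDS_PER_DAY

print(gc_grace_seconds)  # 2937600
print(slack_days)        # 3
```

So the monthly schedule leaves about three days of headroom, compared with repairing every 7 days.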

> If one node persistent a write, plus a "hint" of failed replication write,
> this write will still store as one write in its SSTable files, right? Why
> need to store 2 copies as duplication in SSTable files?

Write destined for replica nodes A B C.

Write comes into A.

Write "fails" but actually succeeds in replicating to B. A writes it as a
hint.

B flushes its memtable.

A then delivers hint to B, creating another copy of the identical write in
a memtable.

B then flushes this new memtable.

There are now two copies of the same write on disk.
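
The sequence above can be sketched with a toy model. This is not Cassandra code: `ToyReplica` and its methods are hypothetical names, and the "memtable" and "SSTables" are plain dictionaries, just enough to show how a replayed hint lands in a second file.

```python
class ToyReplica:
    """Toy replica: one in-memory memtable plus a list of flushed SSTables."""

    def __init__(self):
        self.memtable = {}
        self.sstables = []

    def write(self, key, value, timestamp):
        # A write only touches the current memtable.
        self.memtable[key] = (value, timestamp)

    def flush(self):
        # Flushing turns the current memtable into a new immutable SSTable.
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def copies_on_disk(self, key):
        # Count how many SSTables hold a copy of this key.
        return sum(1 for sstable in self.sstables if key in sstable)


b = ToyReplica()
b.write("k1", "v1", timestamp=100)  # original write reaches B, though A saw a failure
b.flush()                           # B flushes its memtable
b.write("k1", "v1", timestamp=100)  # A delivers the hint: the identical write again
b.flush()                           # B flushes the new memtable

duplicate_copies = b.copies_on_disk("k1")
print(duplicate_copies)  # 2
```

In this model the two copies stay on disk until something merges the two files, which is the role compaction plays in the real system.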

> Here is the duplication count happened in our SSTable files. You can see a
> lot of data duplicate 2 times, but also some with even higher number. But
> max duplication count is 27, can one client retry 27 times?

This many duplicates is almost certainly a result of repair
over-repairing. Re-read this chunk from my previous mail:

> Repair has a fixed granularity, so the larger the size of your dataset the
> more "over-repair" any given "repair" will cause.
> Duplicates occur as a natural consequence of this: if you have 1 row
> which differs in the merkle tree chunk and the merkle tree chunk is, for
> example, 1000 rows, you will "repair" one row and "duplicate" the other
> 999.
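
To put numbers on that, here is a toy calculation under a simplifying assumption: rows are numbered, grouped into fixed-size chunks, and any chunk containing at least one differing row is streamed whole. The function name and parameters are made up for illustration; real repair works on token ranges, not row numbers.

```python
def rows_streamed_by_repair(differing_rows, rows_per_chunk):
    """Return (rows_streamed, rows_already_in_sync) for a toy repair.

    Any chunk that contains at least one differing row is streamed whole.
    """
    # Chunks whose hash would mismatch because they hold a differing row.
    mismatched_chunks = {row // rows_per_chunk for row in differing_rows}
    streamed = len(mismatched_chunks) * rows_per_chunk
    # Rows that were streamed even though both replicas already agreed on them.
    redundant = streamed - len(set(differing_rows))
    return streamed, redundant


streamed, redundant = rows_streamed_by_repair(differing_rows=[42], rows_per_chunk=1000)
print(streamed, redundant)  # 1000 999
```

With a larger dataset and the same number of chunks, rows_per_chunk grows, and so does the redundant count per differing row.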

Question #2 from your original mail is also almost certainly a result of
"over-repair." The "duplicate" chunks can be from any time.

PS - What cassandra version?
