cassandra-user mailing list archives

From Andrey Ilinykh <>
Subject Re: Compact and Repair
Date Thu, 08 Nov 2012 18:20:02 GMT
Nothing unusual. When you run repair, Cassandra streams the inconsistent
ranges from all replicas. If you have wide rows or haven't run repair
regularly, it is very easy to get 10-20% extra data from each replica, which
is probably what happened in your case. In theory Cassandra should compact
the new sstables you get from other nodes. But by default Cassandra only
compacts sstables in the same size tier. Because of the major compaction you
ran before, you have one big sstable and a bunch of small ones, so there is
nothing to compact right now. Eventually Cassandra will compact them, but
nobody knows when that will happen. This is one of the problems caused by
major compaction. For maintenance it is better to have a set of small
sstables than one big one.
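
To make the size-tier point concrete, here is a rough sketch of how
size-tiered bucketing behaves. It is a simplified model, not Cassandra's
actual code: the function names are made up for illustration, the sizes are
invented, and the 0.5/1.5 bucket bounds and the minimum of 4 sstables per
compaction are assumed to match the usual defaults.

```python
def size_tier_buckets(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group sstable sizes into buckets of similar size (simplified model)."""
    buckets = []  # each bucket is a list of sstable sizes
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # An sstable joins a bucket only if it is close to the bucket average.
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough, so start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Return the buckets that hold enough sstables to be compacted (assumed minimum: 4)."""
    return [b for b in size_tier_buckets(sizes) if len(b) >= min_threshold]


# Hypothetical sizes in GB: one big sstable left by the major compaction,
# plus three smaller ones streamed in by repair.
after_repair = [200, 10, 12, 9]
print(compaction_candidates(after_repair))       # nothing qualifies yet

# One more small sstable arrives: the small ones can now be compacted
# together, but the 200 GB sstable still sits alone in its own tier.
print(compaction_candidates(after_repair + [11]))
```

In this model the big sstable is only rewritten once several other sstables
of comparable size exist, which is why the data it shadows can stay on disk
for a long time.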


On Thu, Nov 8, 2012 at 2:55 AM, Henrik Schröder <> wrote:

> Hi,
> We recently ran a major compaction across our cluster, which reduced the
> storage used by about 50%. This is fine, since we do a lot of updates to
> existing data, so that's the expected result.
> The day after, we ran a full repair -pr across the cluster, and when that
> finished, each storage node was at about the same size as before the major
> compaction. Why does that happen? What gets transferred to other nodes, and
> why does it suddenly take up a lot of space again?
> We haven't run repair -pr regularly, so is this just something that
> happens on the first weekly run, and can we expect a different result next
> week? Or does repair always cause the data to grow on each node? To me it
> just doesn't seem proportional?
> /Henrik
