incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: Why data tripled in size after repair?
Date Thu, 27 Sep 2012 16:52:16 GMT
> I don't understand why it copied data twice. In worst case scenario it
> should copy everything (~90G)

Sadly no, repair is currently peer-to-peer based (there is a ticket to
fix it: https://issues.apache.org/jira/browse/CASSANDRA-3200, but
that's not trivial). This means that you can end up with RF times the
data after a repair. Obviously that should be a worst-case scenario, as
it implies everything is repaired, but the triplication itself is a
real problem: a known one, and not so easy to fix.
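
To make the "RF times the data" arithmetic concrete, here is a rough
back-of-the-envelope sketch. It assumes that a node compares its data
with each of its RF - 1 replica peers separately and that each peer
streams over whatever differs, so the same ranges can arrive once per
peer. The function name and the out_of_sync_fraction parameter are made
up for illustration; this is not Cassandra code.

```python
def worst_case_size_after_repair(local_gb, replication_factor,
                                 out_of_sync_fraction=1.0):
    """Rough upper bound on a node's on-disk size right after repair.

    Assumption (illustrative model only): each of the RF - 1 peers
    independently streams the out-of-sync fraction of the node's data,
    and the duplicate copies have not yet been merged away.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    if not 0.0 <= out_of_sync_fraction <= 1.0:
        raise ValueError("out_of_sync_fraction must be between 0 and 1")
    peers = replication_factor - 1
    streamed_gb = peers * out_of_sync_fraction * local_gb
    return local_gb + streamed_gb


# ~90G per node, RF = 3, everything considered out of sync.
print(worst_case_size_after_repair(90, 3))  # 270.0
```

With everything out of sync this gives RF times the original size,
which matches the tripling reported for ~90G at RF = 3. With nothing
out of sync the size stays at 90G.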

Is it possible that each time you ran repair, one of the nodes in
the cluster was very out of sync with the other nodes? Maybe a node
that had been down for a long time?

--
Sylvain
