cassandra-user mailing list archives

From aaron morton <>
Subject Re: R: Re: Re: AntiEntropy?
Date Tue, 12 Jul 2011 18:29:13 GMT
> Running nodetool repair causes Cassandra to execute a major compaction
This is not what I would call factually accurate. Repair does not run a major compaction.
Major compaction is when all SSTables for a CF are compacted down to one SSTable. 
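
To make the difference concrete, here is a rough sketch of the kind of Merkle tree comparison that repair relies on. It is not Cassandra's code: the function names (leaf_hashes, build_tree, differing_ranges), the use of SHA-256, and the bucketing of keys into "ranges" are all assumptions made for illustration. The point is only that the comparison reads and hashes data to find ranges that disagree; nothing in it merges SSTables.

```python
import hashlib


def leaf_hashes(rows, num_ranges):
    """Hash a replica's rows into num_ranges buckets.

    Each bucket stands in for a token range (hypothetical bucketing,
    not Cassandra's partitioner). num_ranges must be a power of two.
    """
    buckets = [hashlib.sha256() for _ in range(num_ranges)]
    for key in sorted(rows):
        idx = int(hashlib.sha256(key.encode()).hexdigest(), 16) % num_ranges
        buckets[idx].update(key.encode() + b"\x00" + rows[key].encode())
    return [b.digest() for b in buckets]


def build_tree(leaves):
    """Return the tree as a list of levels: leaves first, root last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([
            hashlib.sha256(prev[i] + prev[i + 1]).digest()
            for i in range(0, len(prev), 2)
        ])
    return levels


def differing_ranges(tree_a, tree_b):
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    stack = [(len(tree_a) - 1, 0)]
    mismatched = []
    while stack:
        level, i = stack.pop()
        if tree_a[level][i] == tree_b[level][i]:
            continue  # identical subtree, nothing to repair here
        if level == 0:
            mismatched.append(i)  # a leaf range that is out of sync
        else:
            stack.append((level - 1, 2 * i))
            stack.append((level - 1, 2 * i + 1))
    return sorted(mismatched)


replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = dict(replica_a, k2="stale")  # one row out of sync

tree_a = build_tree(leaf_hashes(replica_a, 4))
tree_b = build_tree(leaf_hashes(replica_b, 4))
print(differing_ranges(tree_a, tree_b))  # indices of ranges needing repair
```

Only the ranges reported as different would need to be exchanged between the replicas, which is a separate thing from compacting all SSTables for a CF down to one.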


Aaron Morton
Freelance Cassandra Developer

On 12 Jul 2011, at 10:09, wrote:

>> The book is wrong, at least by current versions of Cassandra (I'm
>> basing that on the quote you pasted, I don't know the context).
> To be sure that I didn't misunderstand (English is not my mother
> tongue), here is what the entire "repair paragraph" says:
>
> Basic Maintenance
>
> There are a few tasks that you’ll need to perform before or after more
> impactful tasks. For example, it makes sense to take a snapshot only
> after you’ve performed a flush. So in this section we look at some of
> these basic maintenance tasks: repair, snapshot, and cleanup.
>
> Repair
>
> Running nodetool repair causes Cassandra to execute a major compaction.
> A Merkle tree of the data on the target node is computed, and the
> Merkle tree is compared with those of other replicas. This step makes
> sure that any data that might be out of sync with other nodes isn’t
> forgotten.
>
> During a major compaction (see “Compaction” in the Glossary), the
> server initiates a TreeRequest/TreeResponse conversation to exchange
> Merkle trees with neighboring nodes. The Merkle tree is a hash
> representing the data in that column family. If the trees from the
> different nodes don’t match, they have to be reconciled (or “repaired”)
> in order to determine the latest data values they should all be set to.
> This tree comparison validation is the responsibility of the
> org.apache.cassandra.service.AntiEntropyService class.
> AntiEntropyService implements the Singleton pattern and defines the
> static Differencer class as well, which is used to compare two trees.
> If it finds any differences, it launches a repair for the ranges that
> don’t agree.
>
> So although Cassandra takes care of such matters automatically on
> occasion, you can run it yourself as well.
>
>> nodetool repair must be scheduled by the operator to run regularly.
>> The name "repair" is a bit unfortunate; it is not meant to imply that
>> it only needs to run when something is "wrong".
>> -- 
>> / Peter Schuller