cassandra-user mailing list archives

From Michael Theroux
Subject nodetool repair
Date Sat, 14 Jul 2012 18:00:15 GMT

I'm looking at nodetool repair with the "-pr" option vs. without it.  Looking around, I'm
seeing a lot of conflicting information out there.  Almost universally, the recommendation
is to run nodetool repair with "-pr" for any day-to-day maintenance.

This is my understanding of how it works.  I would appreciate any corrections.

nodetool repair -pr

- This performs a repair on the "primary range" of the node.  The primary range is essentially
the part of the ring that the node is responsible for.  When this command is run, synchronization
of replicas will occur for the rows that this node is responsible for.  If replicas are missing
from that node's neighbors for those rows, they will be replicated.
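
To make the "primary range" idea concrete, here is a minimal sketch in Python.  It assumes one
token per node (no vnodes) and a toy ring with made-up tokens.  The function name primary_range
and the node names A-D are hypothetical and are not part of Cassandra or nodetool.

```python
def primary_range(tokens, node):
    """Return the (start, end] token range that `node` owns as primary.

    `tokens` maps node name -> token.  Assumption: one token per node.
    A node's primary range runs from the previous node's token (exclusive)
    up to its own token (inclusive).
    """
    ring = sorted(tokens.items(), key=lambda kv: kv[1])
    names = [name for name, _ in ring]
    i = names.index(node)
    prev_token = ring[i - 1][1]  # index -1 wraps around to the last node
    return (prev_token, ring[i][1])


# Toy ring with four nodes and made-up tokens.
tokens = {"A": 0, "B": 25, "C": 50, "D": 75}

print(primary_range(tokens, "B"))  # (0, 25)
print(primary_range(tokens, "A"))  # (75, 0), the range that wraps around the ring
```

Under these assumptions, the primary ranges of all nodes together cover the ring exactly once.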

nodetool repair

- This is where I see a lot of conflicting information.  Many answers suggest that this command
performs a repair across the entire cluster.  However, I don't believe this is true from my
observations (and some of the items I have read seem to agree).  Instead, this command
synchronizes the node's primary range, and also the other ranges that this node may be
responsible for in a replica capacity.  The way I'm thinking about it is that the -pr option
causes repair to push information from the node's primary range to its replicas.  Without -pr,
nodetool repair does a push, and also a pull from the neighbors that this node may be a replica
for.  This makes sense to me, as people recommend running nodetool repair after a node has been
down, to allow the downed node to get any missed information that should have been replicated
to it while it was down.
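
As a rough illustration of the difference in scope, the sketch below lists which token ranges a
node holds with and without restricting to its primary range.  It only shows which ranges are
involved, not how data is streamed during a repair.  It assumes one token per node (no vnodes)
and simple ring-order replica placement, where a node also holds the ranges of the RF-1 nodes
before it.  The function name ranges_for_node, the node names and the tokens are all
hypothetical.

```python
def ranges_for_node(tokens, node, rf, primary_only=False):
    """List the (start, end] token ranges that `node` holds a copy of.

    Assumptions: one token per node, replicas placed on the next rf-1
    nodes clockwise, so a node stores its own primary range plus the
    primary ranges of the rf-1 nodes preceding it on the ring.
    """
    ring = sorted(tokens.items(), key=lambda kv: kv[1])
    names = [name for name, _ in ring]
    n = len(ring)
    i = names.index(node)
    count = 1 if primary_only else min(rf, n)
    result = []
    for k in range(count):
        end = ring[(i - k) % n][1]
        start = ring[(i - k - 1) % n][1]
        result.append((start, end))
    return result


# Toy ring with four nodes and made-up tokens, replication factor 3.
tokens = {"A": 0, "B": 25, "C": 50, "D": 75}

# Scope comparable to "repair -pr" on node C: only its primary range.
print(ranges_for_node(tokens, "C", rf=3, primary_only=True))  # [(25, 50)]

# Scope comparable to plain "repair" on node C: every range it replicates.
print(ranges_for_node(tokens, "C", rf=3))  # [(25, 50), (0, 25), (75, 0)]
```

In this toy model, the primary-only scope run on every node touches each range once, while the
full scope run on every node touches each range RF times.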

I'm sure there are lots of flaws in the above understanding, as I'm cobbling it together.  I
appreciate the feedback,
