Why run it without -pr when recovering from a crash?

 

Repair without -pr runs a full repair of the cluster: the node that receives the command acts as the repair coordinator, and ALL nodes synchronize their replicas at the same time, streaming data between each other.

Problems that may arise:

· When streaming hangs (it tends to hang even on a stable network), the repair session hangs as well (does any version re-stream?)

· The network will be highly saturated

· If inconsistency is high, some nodes may receive a lot of data, and disk usage can grow to much more than 2x (depends on RF)

· A lot of compactions will be pending

 

IMO, the best way to run repair is from a script, with -pr, for a single CF on a single node at a time, monitoring progress, like:

nodetool -h node1 repair -pr ks1 cf1

nodetool -h node2 repair -pr ks1 cf1

nodetool -h node3 repair -pr ks1 cf1

nodetool -h node1 repair -pr ks1 cf2

nodetool -h node2 repair -pr ks1 cf2

nodetool -h node3 repair -pr ks1 cf2

With a progress check or some other control in between; your choice.
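
The sequence above could be driven by a small script. This is only a rough sketch, not a tested tool: the host names, keyspace and CF names are the placeholders from the list above, the function names are made up for illustration, and it assumes nodetool is on the PATH and exits non-zero when a repair fails (check that for your version). Note that a hung streaming session would block the call, so you may want a timeout or an external watchdog around it.

```python
import subprocess
from itertools import product


def repair_commands(nodes, keyspace, column_families):
    """Yield one nodetool command per (CF, node) pair, all nodes for cf1 first."""
    for cf, node in product(column_families, nodes):
        yield ["nodetool", "-h", node, "repair", "-pr", keyspace, cf]


def run_repairs(nodes, keyspace, column_families, run=subprocess.call):
    """Run the repairs strictly one at a time and stop at the first failure.

    `run` is injectable so the loop can be dry-run or tested without a cluster.
    Returns the list of commands that completed successfully.
    """
    done = []
    for cmd in repair_commands(nodes, keyspace, column_families):
        print("running:", " ".join(cmd))
        rc = run(cmd)  # blocks until this repair returns
        if rc != 0:
            print("exit code", rc, "- stopping, check the node before continuing")
            break
        done.append(cmd)
        # Put your own control here, e.g. wait for pending compactions to drain.
    return done


# Example call (placeholder names):
# run_repairs(["node1", "node2", "node3"], "ks1", ["cf1", "cf2"])
```

Stopping at the first failure is a deliberate choice here, so that a problem on one node does not pile more streaming and compaction load onto the rest of the cluster.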

 

Use repair with care; do not let your cluster go down.

 

 

 



Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Follow us on Twitter: @adforminsider

Disclaimer: The information contained in this message and attachments is intended solely for the attention and use of the named addressee and may be confidential. If you are not the intended recipient, you are reminded that the information remains the property of the sender. You must not use, disclose, distribute, copy, print or rely on this e-mail. If you have received this message in error, please contact the sender immediately and irrevocably delete this message and any copies.

From: R. Verlangen [mailto:robin@us2.nl]
Sent: Monday, June 04, 2012 15:17
To: user@cassandra.apache.org
Subject: Re: repair

 

The "repair -pr" only repairs the node's primary range, so it is only useful in day-to-day use. When you're recovering from a crash, use it without -pr.

2012/6/4 Romain HARDOUIN <romain.hardouin@urssaf.fr>


Run "repair -pr" in your cron.

Tamar Fraenkel <tamar@tok-media.com> wrote on 04/06/2012 13:44:32:

> Thanks. 

>
> I actually did just that with cron jobs running on different hours.

>
> I asked the question because I saw that when one of the nodes was
> running the repair, all nodes logged some repair related entries in
> /var/log/cassandra/system.log

>
> Thanks again,

> Tamar Fraenkel 
> Senior Software Engineer, TOK Media 



 

--
With kind regards,

 

Robin Verlangen

Software engineer

 

www.robinverlangen.nl

E robin@us2.nl

 
