cassandra-user mailing list archives

From Romain Hardouin <romainh...@yahoo.fr>
Subject Re: Nodetool repair
Date Thu, 22 Sep 2016 15:43:06 GMT
Hi,
@Matija: George wrote that he uses C* 2.0.9, so the Spotify master is OK for him :-) But
you're right about C* >= 2.1, we also use a fork to run it against our 2.1 clusters.
@George: your repair might be slow rather than stuck. As Alain said, check the progress
with nodetool netstats. Did you set streaming_socket_timeout_in_ms to a value other than
0? What is the value of request_timeout_in_ms? I also suggest you upgrade to the latest 2.0.x
release (i.e. 2.0.17). There is no need to upgrade SSTables, but be sure to read
https://github.com/apache/cassandra/blob/cassandra-2.0/NEWS.txt
Again, you should have a look at cassandra-reaper and the GUI: you will have a progress bar to
follow the repair.
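
To answer the two timeout questions above, something like the following could be run on a node. It is only a sketch: the function name show_repair_timeouts and the default path to cassandra.yaml are assumptions, so pass the real path for your installation. Commented-out lines are printed too, so you can tell when a setting is left at its default.

```shell
# Print the two timeout settings from a cassandra.yaml file.
# The default path below is an assumption; pass your own path as $1.
show_repair_timeouts() {
    conf="${1:-/etc/cassandra/conf/cassandra.yaml}"
    # Match both active and commented-out ("# key: value") settings.
    grep -E '^[[:space:]]*#?[[:space:]]*(streaming_socket_timeout_in_ms|request_timeout_in_ms):' "$conf"
}
```

Usage: show_repair_timeouts /path/to/cassandra.yaml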
Finally, if you want to kill a repair you can invoke forceTerminateAllRepairSessions with
jmxterm on each node:
1. nodetool stop VALIDATION
2. echo run -b org.apache.cassandra.db:type=StorageService forceTerminateAllRepairSessions | java -jar /tmp/jmxterm/jmxterm-1.0-alpha-4-uber.jar -l 127.0.0.1:7199
jmxterm download: http://downloads.sourceforge.net/cyclops-group/jmxterm-1.0-alpha-4-uber.jar
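
Since the two steps have to be repeated on each node, a small helper can print them per node. This is a hedged sketch: the function name repair_kill_commands is made up for illustration, and passing a remote address assumes JMX on port 7199 is reachable from where you run it (otherwise run it locally with 127.0.0.1 on each node). It only prints the commands; nothing is executed until you pipe the output to a shell.

```shell
# Path to the jmxterm jar, as used in the steps above; override via the environment.
JMXTERM_JAR="${JMXTERM_JAR:-/tmp/jmxterm/jmxterm-1.0-alpha-4-uber.jar}"

# Print the two termination steps for one node (nothing is executed here).
# $1 is the node's address; JMX port 7199 matches the example above.
repair_kill_commands() {
    host="$1"
    printf 'nodetool -h %s stop VALIDATION\n' "$host"
    printf 'echo "run -b org.apache.cassandra.db:type=StorageService forceTerminateAllRepairSessions" | java -jar %s -l %s:7199\n' "$JMXTERM_JAR" "$host"
}
```

Usage: repair_kill_commands 127.0.0.1 | sh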
Best,
Romain

On Thursday, 22 September 2016 at 16:45, "Li, Guangxing" <guangxing.li@pearson.com> wrote:



Romain,

I had another repair that seemed to just hang last night. When I ran 'nodetool tpstats' on the
nodes, I saw the following on the node where I initiated the repair:
AntiEntropySessions               1         1
On all other nodes, I see:
AntiEntropySessions               0         0

When I search system.log for the pattern "session completed successfully", I see that the
last range finished 14 hours ago. So I think it is safe to say that the repair
has hung somehow. In order to start another repair, do we need to 'kill' this one? If
so, how do we do that?
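
The two checks described above could be scripted roughly as follows. This is a sketch only: the function names are invented for illustration, and it assumes the two numbers after AntiEntropySessions are the Active and Pending columns of the tpstats output.

```shell
# Read `nodetool tpstats` output on stdin and print the two counts that
# follow the AntiEntropySessions pool name (assumed Active and Pending).
repair_sessions() {
    awk '$1 == "AntiEntropySessions" { print $2, $3 }'
}

# Print the last line in a log file that reports a finished repair range.
# $1 is the path to system.log (the location varies by installation).
last_completed_range() {
    grep 'session completed successfully' "$1" | tail -n 1
}
```

Usage: nodetool tpstats | repair_sessions
       last_completed_range /path/to/system.log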

Thanks.

George.


On Thu, Sep 22, 2016 at 6:23 AM, Romain Hardouin <romainh_ml@yahoo.fr> wrote:

>I meant that pending (and active) AntiEntropySessions are a simple way to check if a repair
>is still running on a cluster. Also have a look at Cassandra reaper:
>- https://github.com/spotify/cassandra-reaper
>
>- https://github.com/spodkowinski/cassandra-reaper-ui
>
>Best,
>Romain
>
>
>
>
>On Wednesday, 21 September 2016 at 22:32, "Li, Guangxing" <guangxing.li@pearson.com> wrote:
>
>Romain,
>
>I started running a new repair. If I see such behavior again, I will try what you mentioned.
>
>Thanks.
>