cassandra-user mailing list archives

From Hannu Kröger <hkro...@gmail.com>
Subject Repair fails for unknown reason
Date Wed, 03 Jan 2018 16:23:11 GMT
Hello,

The situation is as follows:

A repair was started on node X for this keyspace with --full -pr. The repair fails on node Y.

Node Y has debug logging on (DEBUG on org.apache.cassandra) and I’m looking at debug.log.
I see the following messages related to this repair request:

-----------
DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,530 RepairMessageVerbHandler.java:114 - Validating ValidationRequest{gcBefore=1511473932} org.apache.cassandra.repair.messages.ValidationRequest@5a17430c
DEBUG [ValidationExecutor:4] 2018-01-02 17:52:12,531 StorageService.java:3321 - Forcing flush on keyspace mykeyspace, CF mytable
DEBUG [MemtablePostFlush:54] 2018-01-02 17:52:12,531 ColumnFamilyStore.java:954 - forceFlush requested but everything is clean in mytable
ERROR [ValidationExecutor:4] 2018-01-02 17:52:12,532 Validator.java:268 - Failed creating a merkle tree for [repair #1df000a0-effa-11e7-8361-b7c9edfbfc33 on mykeyspace/mytable, [(6917529027641081856,-9223372036854775808]]], /123.123.123.123 (see log for details)
-----------

Then the same appears for another table, followed by the messages below, which indicate
that the repair “master” has basically told this node to abort, right?

-----------
DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,563 RepairMessageVerbHandler.java:142 - Got anticompaction request AnticompactionRequest{parentRepairSession=1de949e0-effa-11e7-8361-b7c9edfbfc33} org.apache.cassandra.repair.messages.AnticompactionRequest@5dc8beea
ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,563 RepairMessageVerbHandler.java:168 - Got error, removing parent repair session
ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,564 CassandraDaemon.java:228 - Exception in thread Thread[AntiEntropyStage:1,5,main]
java.lang.RuntimeException: java.lang.RuntimeException: Parent repair session with id = 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
        at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:171) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_111]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
        at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.0.jar:3.11.0]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
Caused by: java.lang.RuntimeException: Parent repair session with id = 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
        at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:409) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.service.ActiveRepairService.doAntiCompaction(ActiveRepairService.java:444) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:143) ~[apache-cassandra-3.11.0.jar:3.11.0]
        ... 7 common frames omitted
-----------

That is almost everything in the log, and I don’t really see what the original problem
is.

Cassandra flushes the table to start building the Merkle tree, and in the very next millisecond
it fails the repair, without logging a proper exception or any error details about the problem.
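
To check whether anything else was logged around the failure, something like the following could pull every entry within a few milliseconds of an ERROR out of debug.log. This is only a sketch: it assumes the log pattern seen in the excerpts above (level, thread, timestamp, source location, message), and the function names and the 50 ms window are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Start of a log entry as it appears in the excerpts above, e.g.
# "ERROR [ValidationExecutor:4] 2018-01-02 17:52:12,532 Validator.java:268 - Failed ..."
ENTRY_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+:\d+)\s+-\s+(?P<msg>.*)$"
)


def parse_entries(lines):
    """Group raw lines into entries; lines that do not start a new entry
    (stack traces, for example) are appended to the previous entry."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        match = ENTRY_RE.match(line)
        if match:
            entry = match.groupdict()
            entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%d %H:%M:%S,%f")
            entries.append(entry)
        elif entries:
            entries[-1]["msg"] += "\n" + line
    return entries


def entries_near_errors(entries, window_ms=50):
    """Return every entry logged within window_ms of any ERROR entry,
    keeping the original order."""
    window = timedelta(milliseconds=window_ms)
    error_times = [e["ts"] for e in entries if e["level"] == "ERROR"]
    return [
        e for e in entries
        if any(abs(e["ts"] - t) <= window for t in error_times)
    ]


if __name__ == "__main__":
    # The path is an assumption; adjust it to wherever debug.log lives.
    with open("debug.log", encoding="utf-8") as fh:
        for e in entries_near_errors(parse_entries(fh)):
            print(e["ts"], e["level"], e["thread"], e["source"], e["msg"])
```

Grouping by timestamp rather than by repair session id is deliberate here, since the messages from other threads (the flush, for instance) do not carry the session id.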

The Cassandra version is 3.11.0.

Any ideas?

Cheers,
Hannu