cassandra-user mailing list archives

From: Javier Canillas <javier.canil...@gmail.com>
Subject: Re: Node crashes on repair (Cassandra 3.11.1)
Date: Thu, 30 Nov 2017 18:12:53 GMT

Christian,

I'm not an expert, but maybe the Merkle tree is too big to transfer between
nodes and that's why it times out. How many nodes do you have, and what's
the size of the keyspace? Have you ever completed a successful repair before?

Cassandra Reaper repairs one token range (or even just part of one) at a
time, which is why each run only needs a small Merkle tree.
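
To illustrate the idea of subrange repair, here is a rough sketch that splits
a token range into smaller segments and prints one nodetool command per
segment. This is not how Reaper is implemented; it assumes the default
Murmur3Partitioner token bounds and simply cuts the whole ring into equal
pieces. The helper names (split_range, repair_commands), the keyspace name
and the segment count are made up for the example. The -st/-et options are
nodetool's start-token and end-token flags.

```python
# Token bounds of Murmur3Partitioner (assumed; it is Cassandra's default).
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_range(start, end, segments):
    """Split the token range (start, end] into contiguous subranges.

    Hypothetical helper: returns a list of (start_token, end_token) pairs
    that together cover the original range without gaps or overlaps.
    """
    if segments < 1:
        raise ValueError("segments must be >= 1")
    width = end - start
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + (width * i) // segments for i in range(segments + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, segments, start=MIN_TOKEN, end=MAX_TOKEN):
    """Build one 'nodetool repair' command line per subrange."""
    return [
        f"nodetool repair -full -st {st} -et {et} {keyspace}"
        for st, et in split_range(start, end, segments)
    ]


if __name__ == "__main__":
    # 'my_keyspace' and the segment count of 4 are placeholders.
    for cmd in repair_commands("my_keyspace", 4):
        print(cmd)
```

Each command then only has to build and compare Merkle trees for its own
slice of the ring. In practice you would probably want far more than four
segments, and ideally ones aligned with the token ranges each node owns.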

Regards,

Javier.

2017-11-30 6:48 GMT-03:00 Christian Lorenz <Christian.Lorenz@webtrekk.com>:

> Hello,
>
>
>
> after updating our cluster to Cassandra 3.11.1 (previously 3.9), running a
> 'nodetool repair -full' leads to the node crashing.
>
> Logfile showed the following Exception:
>
> ERROR [ReadRepairStage:36] 2017-11-30 07:42:06,439 CassandraDaemon.java:228 - Exception in thread Thread[ReadRepairStage:36,5,main]
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
>         at org.apache.cassandra.service.DataResolver$RepairMergeListener.close(DataResolver.java:199) ~[apache-cassandra-3.11.1.jar:3.11.1]
>         at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.close(UnfilteredPartitionIterators.java:175) ~[apache-cassandra-3.11.1.jar:3.11.1]
>         at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:92) ~[apache-cassandra-3.11.1.jar:3.11.1]
>         at org.apache.cassandra.service.DataResolver.compareResponses(DataResolver.java:76) ~[apache-cassandra-3.11.1.jar:3.11.1]
>         at org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:50) ~[apache-cassandra-3.11.1.jar:3.11.1]
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.1.jar:3.11.1]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_151]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_151]
>         at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) ~[apache-cassandra-3.11.1.jar:3.11.1]
>         at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_151]
>
>
>
> The data size per node is ~270 GB. A repair with Cassandra Reaper works
> fine, though.
>
>
>
> Any idea why this could be happening?
>
>
>
> Regards,
>
> Christian
>
