cassandra-user mailing list archives

From Jeff Jirsa <jji...@gmail.com>
Subject Re: Node crashes on repair (Cassandra 3.11.1)
Date Thu, 30 Nov 2017 18:46:03 GMT
That was worded poorly. The tree has a max depth of 20, so the tree is the
same size for any range covering more than 2**20 partitions.
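To make that bound concrete, here is a small sketch (an illustration of the idea, not Cassandra's actual validation code; the method names are made up) of how capping the tree depth caps its size:

```java
// Sketch: a Merkle tree with a capped depth has a bounded number of leaves,
// no matter how many partitions the repaired range contains.
public class MerkleDepthSketch {
    // Cassandra 3.11 caps validation trees at depth 20 (2^20 leaves).
    static final int MAX_DEPTH = 20;

    // Depth needed for roughly one leaf per partition, capped at MAX_DEPTH.
    // Beyond 2^20 partitions the depth (and tree size) stops growing.
    static int treeDepth(long estimatedPartitions) {
        long n = Math.max(1, estimatedPartitions);
        int depth = 64 - Long.numberOfLeadingZeros(n - 1); // ceil(log2(n))
        return Math.min(depth, MAX_DEPTH);
    }

    public static void main(String[] args) {
        System.out.println(treeDepth(1_000));        // small range: shallow tree
        System.out.println(treeDepth(100_000_000L)); // huge range: capped at 20
    }
}
```

Once a range holds more than 2**20 partitions, each leaf simply covers more partitions; the tree itself stays the same size.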


On Thu, Nov 30, 2017 at 10:43 AM, Jeff Jirsa <jjirsa@gmail.com> wrote:

> Merkle trees have a fixed size/depth (2**20), so it’s not that, but it
> could be timing out elsewhere (or still running validation or something)
>
> --
> Jeff Jirsa
>
>
> On Nov 30, 2017, at 10:12 AM, Javier Canillas <javier.canillas@gmail.com>
> wrote:
>
> Christian,
>
> I'm not an expert, but maybe the merkle tree is too big to transfer
> between nodes and that's why it times out. How many nodes do you have and
> what's the size of the keyspace? Have you ever done a successful repair
> before?
>
> Cassandra Reaper runs repairs per token range (or even per part of one),
> which is why it can get by with a small Merkle tree.
>
> Regards,
>
> Javier.
>
> 2017-11-30 6:48 GMT-03:00 Christian Lorenz <Christian.Lorenz@webtrekk.com>
> :
>
>> Hello,
>>
>>
>>
>> after updating our cluster to Cassandra 3.11.1 (previously 3.9), running
>> ‘nodetool repair -full’ leads to the node crashing.
>>
>> Logfile showed the following Exception:
>>
>> ERROR [ReadRepairStage:36] 2017-11-30 07:42:06,439 CassandraDaemon.java:228 - Exception in thread Thread[ReadRepairStage:36,5,main]
>> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
>>         at org.apache.cassandra.service.DataResolver$RepairMergeListener.close(DataResolver.java:199) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.close(UnfilteredPartitionIterators.java:175) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:92) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.service.DataResolver.compareResponses(DataResolver.java:76) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:50) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_151]
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_151]
>>         at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_151]
>>
>>
>>
>> The node's data size is ~270GB. A repair with Cassandra Reaper works fine
>> though.
>>
>>
>>
>> Any idea why this could be happening?
>>
>>
>>
>> Regards,
>>
>> Christian
>>
>
>
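A rough sketch of the subrange approach Javier describes (an illustration of the general technique, not Reaper's actual code): split the full Murmur3 token ring into equal slices and repair one slice at a time, so each validation builds a Merkle tree over a small slice.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Sketch: split the Murmur3 token ring (Long.MIN_VALUE..Long.MAX_VALUE)
// into `parts` equal subranges, like a subrange-repair scheduler would.
public class SubrangeSplit {
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    // Returns parts+1 boundary tokens; each adjacent pair is one subrange.
    static List<BigInteger> boundaries(int parts) {
        BigInteger span = MAX.subtract(MIN);
        List<BigInteger> out = new ArrayList<>();
        for (int i = 0; i <= parts; i++) {
            out.add(MIN.add(span.multiply(BigInteger.valueOf(i))
                                .divide(BigInteger.valueOf(parts))));
        }
        return out;
    }

    public static void main(String[] args) {
        // Each adjacent pair could be handed to a per-subrange repair task.
        for (BigInteger b : boundaries(4)) System.out.println(b);
    }
}
```

Each adjacent pair of boundaries could then be passed to `nodetool repair -full -st <start> -et <end> <keyspace>`, which is essentially what Reaper schedules automatically.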
