cassandra-user mailing list archives

From shalom sagges <>
Subject Re: AbstractLocalAwareExecutorService Exception During Upgrade
Date Wed, 19 Jun 2019 09:37:47 GMT
Hi Again,

Trying to bump this up, as I wasn't able to find the root cause of this issue.
Perhaps I need to upgrade to 3.0 first?
I'd be happy to get some ideas.


On Thu, Jun 6, 2019 at 5:31 AM Jonathan Koppenhofer <> wrote:

> Not sure about why repair is running, but we are also seeing the same
> merkle tree issue in a mixed version cluster in which we have intentionally
> started a repair against 2 upgraded DCs. We are currently researching, and
> can post back if we find the issue, but also would appreciate if someone
> has a suggestion. We have also run a local repair in an upgraded DC in this
> same mixed version cluster without issue.
> We are going 2.1.x to 3.0.x... and yes, we know you are not supposed to
> run repairs in mixed version clusters, so don't do it :) This is kind of a
> special circumstance where other things have gone wrong.
> Thanks
> On Wed, Jun 5, 2019, 5:23 PM shalom sagges <> wrote:
>> If anyone has any idea on what might cause this issue, it'd be great.
>> I don't understand what could trigger this exception.
>> But what I really can't understand is why repairs started to run suddenly
>> :-\
>> There's no cron job running, no active repair process, no Validation
>> compactions, Reaper is turned off....  I see repair running only in the
>> logs.
>> Thanks!
>> On Wed, Jun 5, 2019 at 2:32 PM shalom sagges <>
>> wrote:
>>> Hi All,
>>> I'm having a bad situation where after upgrading 2 nodes (binaries only)
>>> from 2.1.21 to 3.11.4 I'm getting a lot of warnings as follows:
>>> Uncaught exception on thread Thread[ReadStage-5,5,main]: {}
>>> java.lang.ArrayIndexOutOfBoundsException: null
>>> I also see errors on repairs, but no repair is running at all. I verified
>>> this with the ps -ef command and nodetool compactionstats. The error I see is:
>>> Failed creating a merkle tree for [repair
>>> #a95498f0-8783-11e9-b065-81cdbc6bee08 on system_auth/users, []], /
>>> (see log for details)
>>> I saw repair errors on data tables as well.
>>> nodetool status shows all are UN and nodetool describecluster shows two
>>> schema versions as expected.
>>> After the warnings appeared, clients started to get timed out read/write
>>> queries.
>>> Restarting the 2 nodes solved the clients' connection issues, but the
>>> warnings are still being generated in the logs.
>>> Did anyone encounter such an issue and knows what this means?
>>> Thanks!
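For anyone hitting the same symptom, the checks described above (ps -ef, nodetool compactionstats) can be combined with pulling the repair session IDs out of the log itself, to confirm whether the "repairs" exist anywhere except in the log messages. This is only a sketch: the helper name is made up here, and the log path and message format are assumptions based on the default Cassandra layout and the error quoted above.

```shell
# Sketch: list the distinct repair session IDs mentioned in a Cassandra
# system.log, so they can be cross-checked against live repair state.
# repair_sessions is a hypothetical helper; the "repair #<uuid>" pattern
# is taken from the merkle-tree error quoted in this thread.
repair_sessions() {
  grep -o 'repair #[0-9a-f-]*' "$1" | sort -u
}

# Live-side checks (require a running node):
#   ps -ef | grep -i [r]epair            # no repair process should show up
#   nodetool compactionstats             # look for Validation compactions
#   nodetool netstats                    # look for repair-related streaming
# Default log location is typically /var/log/cassandra/system.log, e.g.:
#   repair_sessions /var/log/cassandra/system.log
```

If session IDs keep appearing in the log while all of the live checks come back empty, the repair messages are likely being triggered by another node in the cluster rather than locally.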
