cassandra-commits mailing list archives

From "Michael Fong (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-11748) Schema version mismatch may lead to Cassandra OOM at bootstrap during a rolling upgrade process
Date Fri, 23 Jun 2017 01:00:00 GMT


Michael Fong commented on CASSANDRA-11748:

Hi, [~mbyrd], 

Thanks for looking into this issue. 

If my memory serves me correctly, we observed that the number of schema migration request
and response messages exchanged between two nodes grows linearly with:
1. The number of gossip messages a node sent to the other node that were not yet answered,
because the other node was in the process of restarting.
2. The number of seconds for which internal communication between the two nodes was blocked.

It is also true that we had *a lot* of tables (over 500), and that makes each round
of schema migration more expensive. Our workaround was to add a throttle on the number of
schema migration tasks requested in the v2.0 source code, and that seemed to work just fine.
This makes sense because each schema migration task requests a full copy of the schema, as
far as I remember. Hence, requesting a migration 100+ times is likely inefficient per se.
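
The throttle idea described above could look roughly like the following. This is only a
minimal sketch and not the patch referred to above: the class name MigrationRequestThrottle,
the tryAcquire method, and the minimum-interval-per-endpoint policy are all assumptions made
for illustration, and none of them are part of Cassandra.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper (not a Cassandra class): permits at most one schema
// migration request per endpoint within a configurable minimum interval.
final class MigrationRequestThrottle {
    private final long minIntervalMillis;
    // Timestamp (in milliseconds) of the last permitted request, per endpoint.
    private final ConcurrentMap<String, Long> lastRequestMillis = new ConcurrentHashMap<>();

    MigrationRequestThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // Returns true if a migration request to the endpoint may be submitted at nowMillis.
    // The check and the timestamp update happen atomically per endpoint.
    boolean tryAcquire(String endpoint, long nowMillis) {
        boolean[] allowed = {false};
        lastRequestMillis.compute(endpoint, (key, last) -> {
            if (last == null || nowMillis - last >= minIntervalMillis) {
                allowed[0] = true;
                return nowMillis; // record this request as the latest one
            }
            return last; // too soon: keep the previous timestamp
        });
        return allowed[0];
    }
}
```

A caller would check tryAcquire(endpoint, System.currentTimeMillis()) before submitting a
migration task and skip the submission when it returns false, so that repeated gossip
events for the same endpoint do not each trigger a full schema pull.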

Last but not least, the root cause of the differing schema version is still unknown. That
is, a node has schema version A before the restart but reports schema version B after the
C* instance restarts. This seems to happen at random, and we are not sure how to reproduce
it. Our best guesses are:
1. Some input to the schema hash calculation differs after restarting the C* instance,
perhaps a timestamp.
2. At the file system level, a schema migration task was not successfully flushed to
disk before the process was killed.


Michael Fong

> Schema version mismatch may lead to Cassandra OOM at bootstrap during a rolling upgrade
> -----------------------------------------------------------------------------------------------
>                 Key: CASSANDRA-11748
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Rolling upgrade process from 1.2.19 to 2.0.17. 
> CentOS 6.6
> Occurred on different C* nodes in deployments of different scale (2G ~ 5G)
>            Reporter: Michael Fong
>            Assignee: Matt Byrd
>            Priority: Critical
>             Fix For: 3.0.x, 3.11.x, 4.x
> We have observed multiple times that a multi-node C* (v2.0.17) cluster ran into OOM at
> bootstrap during a rolling upgrade from 1.2.19 to 2.0.17.
> Here is the outline of our rolling upgrade process:
> 1. Update the schema on a node, and wait until all nodes are in schema version agreement
> (checked via nodetool describecluster)
> 2. Restart a Cassandra node
> 3. After the restart, there is a chance that the restarted node has a different schema version
> 4. All nodes in the cluster start to rapidly exchange schema information, and any node
> could run into OOM.
> The following is the system.log output from one of our 2-node cluster test beds
> ----------------------------------
> Before rebooting node 2:
> Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 (line 328) Gossiping my schema version 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 (line 328) Gossiping my schema version 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> After rebooting node 2:
> Node 2: DEBUG [main] 2016-04-19 11:18:18,016 (line 328) Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b
> Node 2 keeps submitting the migration task to the other node, over 100 times.
> INFO [GossipStage:1] 2016-04-19 11:18:18,261 (line 1011) Node / has restarted, now UP
> INFO [GossipStage:1] 2016-04-19 11:18:18,262 (line 414) Updating topology for /
> ...
> DEBUG [GossipStage:1] 2016-04-19 11:18:18,265 (line 102) Submitting migration task for /
> ... (over 100 times)
> ----------------------------------
> On the other hand, Node 1 keeps updating its gossip information, and then receives
> and submits migration tasks:
> INFO [RequestResponseStage:3] 2016-04-19 11:18:18,333 (line 978) InetAddress / is now UP
> ...
> DEBUG [MigrationStage:1] 2016-04-19 11:18:18,496 (line 41) Received migration request from /
> ... (over 100 times)
> DEBUG [OptionalTasks:1] 2016-04-19 11:19:18,337 (line 127) submitting migration task for /
> ... (over 50 times)
> As a side note, we have over 200 column families defined in the Cassandra database, which
> may be related to this amount of RPC traffic.
> P.S.2 The excess schema migration requests eventually lead to InternalResponseStage
> performing a schema merge for each response. Since each merge requires a compaction, the
> responses are consumed much more slowly than they arrive. Thus, the backlog of incoming
> schema migration content objects consumes all of the heap space and ultimately ends in OOM.
