Date: Thu, 10 Aug 2017 11:40:00 +0000 (UTC)
From: "Stefan Podkowinski (JIRA)"
To: commits@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-11748) Schema version mismatch may leads to Casandra OOM at bootstrap during a rolling upgrade process

[ https://issues.apache.org/jira/browse/CASSANDRA-11748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121506#comment-16121506 ]

Stefan Podkowinski commented on CASSANDRA-11748:
------------------------------------------------

I'm not sure that introducing a hard cap on pending outgoing pull requests and simply dropping anything beyond it is the way to go here. The good thing about the approach is that it's pretty much stateless, except for the atomic counter. But we should at least take the schema IDs and/or endpoints into account as well. It just doesn't make sense to queue 50 requests for the same schema ID and potentially drop requests for a different schema afterwards. Also, as already noted, issuing pulls in parallel is probably not what we want, as this could lead to the described OOM issue when too many responses get queued and applied at the same time.
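
To make the point about schema IDs and serial pulls concrete, here is a minimal Java sketch. It is only an illustration under stated assumptions: `SchemaPullScheduler`, `Pull`, `submit`, `next` and `completed` are hypothetical names invented for this example and are not part of Cassandra's MigrationManager, and endpoints are plain strings rather than real addresses.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.UUID;

/** Hypothetical sketch, not Cassandra code: dedupe pulls by schema ID, run one at a time. */
final class SchemaPullScheduler {
    /** One pending pull: the schema version to fetch and the endpoint to ask. */
    static final class Pull {
        final UUID schemaId;
        final String endpoint;

        Pull(UUID schemaId, String endpoint) {
            this.schemaId = schemaId;
            this.endpoint = endpoint;
        }
    }

    private final Queue<Pull> pending = new ArrayDeque<>();
    private final Set<UUID> known = new HashSet<>(); // schema IDs queued or in flight
    private boolean inFlight = false;

    /** Queues a pull unless one for the same schema ID is already queued or in flight. */
    synchronized boolean submit(UUID schemaId, String endpoint) {
        if (!known.add(schemaId)) {
            return false; // duplicate schema ID: nothing new to learn from another pull
        }
        pending.add(new Pull(schemaId, endpoint));
        return true;
    }

    /** Returns the next pull to send, or null if one is in flight or the queue is empty. */
    synchronized Pull next() {
        if (inFlight || pending.isEmpty()) {
            return null;
        }
        inFlight = true;
        return pending.poll();
    }

    /** Call once the response for this schema ID has been applied, or the pull failed. */
    synchronized void completed(UUID schemaId) {
        known.remove(schemaId);
        inFlight = false;
    }
}
```

With something along these lines, 50 gossip notifications for the same schema ID collapse into a single queued pull, and a pull for a different schema ID is never dropped in their favour. Retry, delay and per-endpoint tracking are deliberately left out.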
So I think we can't get around managing some more state, such as schema IDs, endpoints, last request time, delay, .., that we can use to schedule pulls in a more efficient way, by doing one request after another.

But we should also not forget to look at the receiver side for incoming pull requests. Joining the cluster with a schema mismatch should not cause a node to answer each of those in parallel. If we keep track of pending incoming schema requests, we could introduce a delay before responding and create the schema mutations just once, as a payload to be used for all of them. We might have to bump up the MIGRATION_REQUEST timeout in that case, but otherwise just delaying a few seconds should make a notable difference for nodes joining the cluster and having to answer many migration requests in a short time frame.

> Schema version mismatch may leads to Casandra OOM at bootstrap during a rolling upgrade process
> -----------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11748
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11748
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Rolling upgrade process from 1.2.19 to 2.0.17.
> CentOS 6.6
> Occurred on different C* nodes at different scales of deployment (2G ~ 5G)
>            Reporter: Michael Fong
>            Assignee: Matt Byrd
>            Priority: Critical
>             Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We have observed multiple times that a multi-node C* (v2.0.17) cluster ran into OOM at bootstrap during a rolling upgrade process from 1.2.19 to 2.0.17.
> Here is a simple outline of our rolling upgrade process:
> 1. Update the schema on a node, and wait until all nodes are in schema version agreement (via nodetool describecluster)
> 2. Restart a Cassandra node
> 3. After the restart, there is a chance that the restarted node has a different schema version.
> 4. All nodes in the cluster start to rapidly exchange schema information, and any node could run into OOM.
> The following is the system.log output from one of our 2-node cluster test beds
> ----------------------------------
> Before rebooting node 2:
> Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 MigrationManager.java (line 328) Gossiping my schema version 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 MigrationManager.java (line 328) Gossiping my schema version 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> After rebooting node 2,
> Node 2: DEBUG [main] 2016-04-19 11:18:18,016 MigrationManager.java (line 328) Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b
> Node 2 keeps submitting the migration task to the other node, 100+ times.
> INFO [GossipStage:1] 2016-04-19 11:18:18,261 Gossiper.java (line 1011) Node /192.168.88.33 has restarted, now UP
> INFO [GossipStage:1] 2016-04-19 11:18:18,262 TokenMetadata.java (line 414) Updating topology for /192.168.88.33
> ...
> DEBUG [GossipStage:1] 2016-04-19 11:18:18,265 MigrationManager.java (line 102) Submitting migration task for /192.168.88.33
> ... (100+ times)
> ----------------------------------
> On the other hand, Node 1 keeps updating its gossip information, followed by receiving and submitting migration tasks afterwards:
> INFO [RequestResponseStage:3] 2016-04-19 11:18:18,333 Gossiper.java (line 978) InetAddress /192.168.88.34 is now UP
> ...
> DEBUG [MigrationStage:1] 2016-04-19 11:18:18,496 MigrationRequestVerbHandler.java (line 41) Received migration request from /192.168.88.34.
> …… (100+ times)
> DEBUG [OptionalTasks:1] 2016-04-19 11:19:18,337 MigrationManager.java (line 127) submitting migration task for /192.168.88.34
> ..... (50+ times)
> As a side note, we have 200+ column families defined in the Cassandra database, which may be related to this amount of RPC traffic.
> P.S.2 The excess schema migration tasks eventually have InternalResponseStage perform schema merge operations. Since each merge requires a compaction, the responses are consumed much more slowly than they arrive. The backlog of incoming schema migration content objects therefore consumes all of the heap space and ultimately ends in OOM!