From: GitBox
To: dev@storm.apache.org
Subject: [GitHub] [storm] Ethanlm commented on a change in pull request #3133: STORM-3516 Kill or Rebalance Topology not processed on Nimbus restart
Message-ID: <157011136449.14361.3168996787359787114.gitbox@gitbox.apache.org>
Date: Thu, 03 Oct 2019 14:02:44 -0000

Ethanlm commented on a change in pull request #3133: STORM-3516 Kill or Rebalance Topology not processed on Nimbus restart
URL: https://github.com/apache/storm/pull/3133#discussion_r331056143

##########
File path: storm-server/src/main/java/org/apache/storm/daemon/nimbus/Nimbus.java
##########
@@ -1328,12 +1329,24 @@ public void launchServer() throws Exception {
                 exec.prepare();
             }
 
-            if (isLeader()) {
-                for (String topoId : state.activeStorms()) {
-                    transition(topoId, TopologyActions.STARTUP, null);
-                }
-                clusterMetricSet.setActive(true);
-            }
+            // Leadership coordination may be incomplete when launchServer is called. Previous behavior did a one time check
+            // which could cause Nimbus to not process TopologyActions.STARTUP transitions. Similar problem exists for
+            // HA Nimbus on being newly elected as leader. Change to a recurring pattern addresses these problems.
+            timer.scheduleRecurring(3, 5,
+                () -> {
+                    try {
+                        boolean isLeader = isLeader();
+                        if (isLeader && !wasLeader) {
+                            for (String topoId : state.activeStorms()) {
+                                transition(topoId, TopologyActions.STARTUP, null);

Review comment:
It might be worth looking into changing `STARTUP` to some text referring to gaining leadership, if it's feasible and makes sense to do so.
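
For readers following the thread, the recurring check in the hunk above is an edge-triggered pattern: the startup work runs only on the transition from "not leader" to "leader", not on every tick. Below is a minimal, self-contained sketch of that idea. It is not the code from the pull request: `LeadershipEdgeDetector`, `poll`, and the callback wiring are hypothetical names chosen for illustration, the `BooleanSupplier` stands in for `isLeader()`, and the `Runnable` stands in for the loop over `state.activeStorms()`. The choice to update `wasLeader` only after the callback returns is also an assumption of this sketch.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical helper (not part of Storm): runs a callback once each time
 * leadership is gained, when poll() is invoked on a recurring schedule.
 */
class LeadershipEdgeDetector {
    private final BooleanSupplier isLeader;     // stand-in for Nimbus.isLeader()
    private final Runnable onGainedLeadership;  // stand-in for the STARTUP transitions
    private boolean wasLeader = false;          // leadership state seen on the previous poll

    LeadershipEdgeDetector(BooleanSupplier isLeader, Runnable onGainedLeadership) {
        this.isLeader = isLeader;
        this.onGainedLeadership = onGainedLeadership;
    }

    /** Intended to be called periodically, for example from a recurring timer. */
    synchronized void poll() {
        boolean leaderNow = isLeader.getAsBoolean();
        if (leaderNow && !wasLeader) {
            // Rising edge: this instance has just become leader.
            onGainedLeadership.run();
        }
        // Assumption of this sketch: the flag is updated only after the callback
        // returns normally, so a callback that throws is retried on the next poll.
        wasLeader = leaderNow;
    }

    public static void main(String[] args) {
        // Simulated leadership readings over five ticks.
        boolean[] readings = {false, true, true, false, true};
        int[] tick = {0};
        LeadershipEdgeDetector detector = new LeadershipEdgeDetector(
            () -> readings[tick[0]],
            () -> System.out.println("gained leadership at tick " + tick[0]));
        for (tick[0] = 0; tick[0] < readings.length; tick[0]++) {
            detector.poll();
        }
    }
}
```

Seen this way, the callback fires on a leadership change rather than on process start, which is the motivation for naming the action after gaining leadership instead of `STARTUP`.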