Date: Thu, 10 Dec 2015 09:36:11 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: cloudstack-issues@incubator.apache.org
Reply-To: dev@cloudstack.apache.org
Subject: [jira] [Commented] (CLOUDSTACK-9114) restartnetwork with cleanup should not update/restart both routers at once

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050445#comment-15050445 ]

ASF GitHub Bot commented on CLOUDSTACK-9114:
--------------------------------------------

Github user ustcweizhou commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1198#discussion_r47204914

--- Diff: engine/orchestration/src/org/apache/cloudstack/engine/orchestration/NetworkOrchestrator.java ---
@@ -2558,6 +2575,62 @@ public boolean restartNetwork(Long networkId, Account callerAccount, User caller
         }
     }

+    /* If there are redundant routers in the isolated network, we follow the steps to make the network working better
+     * (1) destroy backup router if exists
+     * (2) create new backup router
+     * (3) destroy master router (then the backup will become master)
+     * (4) create a new router as backup router.
+     */
+    private boolean restartGuestNetworkWithRedundantRouters(NetworkVO network, List routers, ReservationContext context) throws ResourceUnavailableException, ConcurrentOperationException, InsufficientCapacityException {
+        Account caller = CallContext.current().getCallingAccount();
+        long callerUserId = CallContext.current().getCallingUserId();
+
+        // check the master and backup redundant state
+        DomainRouterVO masterRouter = null;
+        DomainRouterVO backupRouter = null;
+        if (routers != null && routers.size() == 1) {
+            masterRouter = routers.get(0);
+        } if (routers != null && routers.size() == 2) {
+            DomainRouterVO router1 = routers.get(0);
+            DomainRouterVO router2 = routers.get(1);
+            if (router1.getRedundantState() == RedundantState.MASTER || router2.getRedundantState() == RedundantState.BACKUP) {
+                masterRouter = router1;
+                backupRouter = router2;
+            } else if (router1.getRedundantState() == RedundantState.BACKUP || router2.getRedundantState() == RedundantState.MASTER) {
+                masterRouter = router2;
+                backupRouter = router1;
+            } else { // both routers are in UNKNOWN state
+                masterRouter = router1;
+                backupRouter = router2;
+            }
+        }
+
+        NetworkOfferingVO offering = _networkOfferingDao.findByIdIncludingRemoved(network.getNetworkOfferingId());
+        DeployDestination dest = new DeployDestination(_dcDao.findById(network.getDataCenterId()), null, null, null);
+        List providersToImplement =
getNetworkProviders(network.getId());
+
+        // destroy backup router
+        if (backupRouter != null) {
+            _routerService.destroyRouter(backupRouter.getId(), caller, callerUserId);
+        }
+        // create new backup router
+        implementNetworkElements(dest, context, network, offering, providersToImplement);
+
+        // destroy master router
+        if (masterRouter != null) {
+            try {
+                Thread.sleep(10000); // wait 10s for the keepalived/conntrackd on backup router
--- End diff --

@wilderrodrigues I originally made this change for 4.2.1. I ran into issues during restartnetwork because of keepalived/conntrackd, so I added the sleep. 10 seconds is enough time for the services to come up. I did not check the difference between keepalived on Debian 7.0.0 and Debian 7.9.0, so I assumed the issue still exists. I am busy with other issues and have no time to write the test; automated tests and improvements are welcome.

> restartnetwork with cleanup should not update/restart both routers at once
> --------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9114
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9114
>             Project: CloudStack
>          Issue Type: Improvement
>      Security Level: Public(Anyone can view this level - this is the default.)
>            Reporter: Wei Zhou
>            Assignee: Wei Zhou
>
> for now, restartnetwork with cleanup will stop both RVRs at first, then start two new RVRs.
> to reduce the downtime of network, we'd better restart the RVRs one by one.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
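
As a possible follow-up to the fixed 10-second sleep discussed in the comment above: one option would be to poll for readiness with an upper bound instead of always waiting the full interval. The sketch below is only an illustration of that idea. `RouterReadiness` and `waitUntil` are hypothetical names, not part of CloudStack, and the readiness check itself (for example, whether keepalived/conntrackd are running on the new backup router) is passed in by the caller because the diff does not show how it would be obtained.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not CloudStack code): bounded polling instead of a fixed sleep. */
final class RouterReadiness {

    private RouterReadiness() {
    }

    /**
     * Polls {@code ready} until it returns true or {@code timeoutMs} has elapsed.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier ready, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (ready.getAsBoolean()) {
                return true;
            }
            long remainingNs = deadline - System.nanoTime();
            if (remainingNs <= 0) {
                return false;
            }
            // Sleep for the poll interval, but never past the deadline.
            long sleepMs = Math.min(intervalMs, Math.max(1L, remainingNs / 1_000_000L));
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With a helper like this, the caller could keep 10000 ms as the timeout and destroy the master router only once the check succeeds, so the common case would finish sooner while the worst case stays the same as the current sleep.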