Mailing-List: contact issues-help@cloudstack.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cloudstack.apache.org
Date: Thu, 10 Dec 2015 09:36:11 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)" <jira@apache.org>
To: cloudstack-issues@incubator.apache.org
Message-ID: <JIRA.12919653.1449485943000.323601.1449740171628@Atlassian.JIRA>
In-Reply-To: <JIRA.12919653.1449485943000@Atlassian.JIRA>
References: <JIRA.12919653.1449485943000@Atlassian.JIRA>
 <JIRA.12919653.1449485943101@arcas>
Subject: [jira] [Commented] (CLOUDSTACK-9114) restartnetwork with cleanup
 should not update/restart both routers at once
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050445#comment-15050445 ] 

ASF GitHub Bot commented on CLOUDSTACK-9114:
--------------------------------------------

Github user ustcweizhou commented on a diff in the pull request:

    https://github.com/apache/cloudstack/pull/1198#discussion_r47204914
  
    --- Diff: engine/orchestration/src/org/apache/cloudstack/engine/orchestration/NetworkOrchestrator.java ---
    @@ -2558,6 +2575,62 @@ public boolean restartNetwork(Long networkId, Account callerAccount, User caller
             }
         }
     
    +    /* If there are redundant routers in the isolated network, we follow the steps to make the network working better
    +     *  (1) destroy backup router if exists
    +     *  (2) create new backup router
    +     *  (3) destroy master router (then the backup will become master)
    +     *  (4) create a new router as backup router.
    +     */
    +    private boolean restartGuestNetworkWithRedundantRouters(NetworkVO network, List<DomainRouterVO> routers, ReservationContext context) throws ResourceUnavailableException, ConcurrentOperationException, InsufficientCapacityException {
    +        Account caller = CallContext.current().getCallingAccount();
    +        long callerUserId = CallContext.current().getCallingUserId();
    +
    +        // check the master and backup redundant state
    +        DomainRouterVO masterRouter = null;
    +        DomainRouterVO backupRouter = null;
    +        if (routers != null && routers.size() == 1) {
    +            masterRouter = routers.get(0);
    +        } if (routers != null && routers.size() == 2) {
    +            DomainRouterVO router1 = routers.get(0);
    +            DomainRouterVO router2 = routers.get(1);
    +            if (router1.getRedundantState() == RedundantState.MASTER || router2.getRedundantState() == RedundantState.BACKUP) {
    +                masterRouter = router1;
    +                backupRouter = router2;
    +            } else if (router1.getRedundantState() == RedundantState.BACKUP || router2.getRedundantState() == RedundantState.MASTER) {
    +                masterRouter = router2;
    +                backupRouter = router1;
    +            } else { // both routers are in UNKNOWN state
    +                masterRouter = router1;
    +                backupRouter = router2;
    +            }
    +        }
    +
    +        NetworkOfferingVO offering = _networkOfferingDao.findByIdIncludingRemoved(network.getNetworkOfferingId());
    +        DeployDestination dest = new DeployDestination(_dcDao.findById(network.getDataCenterId()), null, null, null);
    +        List<Provider> providersToImplement = getNetworkProviders(network.getId());
    +
    +        // destroy backup router
    +        if (backupRouter != null) {
    +            _routerService.destroyRouter(backupRouter.getId(), caller, callerUserId);
    +        }
    +        // create new backup router
    +        implementNetworkElements(dest, context, network, offering, providersToImplement);
    +
    +        // destroy master router
    +        if (masterRouter != null) {
    +            try {
    +                Thread.sleep(10000); // wait 10s for the keepalived/conntrackd on backup router
    --- End diff --
    
    @wilderrodrigues 
    I originally made this change for 4.2.1. I ran into some issues during restartnetwork because of keepalived/conntrackd, hence I added the sleep. 10 seconds is enough for the services to come up.
    I did not check the difference between keepalived on Debian 7.0.0 and Debian 7.9.0, so I assumed the issue still exists.
    
    I am busy with other issues and have no time to write the test; automated tests and improvements are welcome.
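
As a possible improvement to the fixed 10-second sleep discussed above, the wait could be bounded by a deadline and end as soon as the backup router reports ready. The sketch below is only an illustration: `ReadinessWait`, `waitUntil`, and the readiness check passed to it are hypothetical names, not part of CloudStack, and how to actually query keepalived/conntrackd on the router is left open.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not CloudStack code: polls a readiness check
// until it succeeds or a timeout elapses.
final class ReadinessWait {
    private ReadinessWait() {}

    /**
     * Returns true if {@code ready} reported true before {@code timeoutMillis}
     * elapsed, false on timeout or if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier ready, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (ready.getAsBoolean()) {
                return true;
            }
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false;
            }
            try {
                // Never sleep past the deadline, and always sleep at least 1 ms.
                Thread.sleep(Math.max(1L, Math.min(pollMillis, remainingMillis)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

With a helper like this, the 10 seconds would become an upper bound rather than a fixed delay, for example `ReadinessWait.waitUntil(check, 10000, 500)`, where `check` stands in for whatever status probe is available.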
    

> restartnetwork with cleanup should not update/restart both routers at once
> --------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9114
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9114
>             Project: CloudStack
>          Issue Type: Improvement
>      Security Level: Public(Anyone can view this level - this is the default.) 
>            Reporter: Wei Zhou
>            Assignee: Wei Zhou
>
> Currently, restartnetwork with cleanup first stops both RVRs, then starts two new RVRs.
> To reduce network downtime, we should restart the RVRs one by one.
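
To make the downtime argument concrete, here is a toy model that counts how many routers are running after each step of the two strategies. It assumes a network that starts with one master and one backup router; the class and method names are hypothetical, and nothing here calls CloudStack.

```java
import java.util.List;

// Toy model, not CloudStack code: tracks the number of running routers
// through each restart strategy and reports the minimum seen.
final class RestartOrderSketch {
    private RestartOrderSketch() {}

    /** One-by-one restart, following the four steps in the patch comment. */
    static int minRunningOneByOne(List<String> log) {
        int running = 2; // assumed start: one master, one backup
        int min = running;

        running--; // (1) destroy backup router
        log.add("destroy backup -> " + running + " running");
        min = Math.min(min, running);

        running++; // (2) create new backup router
        log.add("create backup -> " + running + " running");

        running--; // (3) destroy master router; the backup takes over
        log.add("destroy master -> " + running + " running");
        min = Math.min(min, running);

        running++; // (4) create a new router as backup
        log.add("create backup -> " + running + " running");

        return min;
    }

    /** Previous behavior: stop both routers, then start two new ones. */
    static int minRunningStopBothFirst(List<String> log) {
        int running = 2; // assumed start: one master, one backup
        running -= 2;
        log.add("stop both -> " + running + " running");
        int min = running;
        running += 2;
        log.add("start two new -> " + running + " running");
        return min;
    }
}
```

In this model the one-by-one order never drops below one running router, while stopping both first passes through zero. The model ignores failover time, which is what the sleep in the patch is meant to cover.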

