cloudstack-dev mailing list archives

From Daan Hoogland
Subject [PROPOSAL] reducing VR downtime on upgrade
Date Thu, 15 Feb 2018 15:36:14 GMT
The intention of this proposal is to set out a way forward for reducing maintenance downtime for
virtual routers. There are two parts to this proposal:

  1.  Dealing with legacy routers and replacing them before shutting down.
  2.  Unifying router embodiments and making use of redundancy mechanisms to quickly fail over
from old to new.

Ad 1. It will always be possible that a router is too old and will not be able to talk to the
new version that is to replace it. This might be due to a keepalived update or replacement,
or just because it is very old. So although unifying the routers and making them redundancy-enabled
will solve a lot of use cases, it will never deal with every conceivable situation, not even
in systems upgraded to a version in which all intended functionality has been implemented.
Dealing with any older router is to work as follows:

  1.  A check will be done to make sure the old VR is still up.
     *   If it is not, there is nothing to take into consideration: it will be replaced as quickly
as possible. Possible improvements here are the iptables configuration speedup and other generic
optimisations unrelated to the upgrade itself.
     *   If it is, we need to walk on eggshells while provisioning the new one 😉
  2.  A new VR will be instantiated
  3.  Configuration data will be sent but not applied.
  4.  The interfaces will be added and, if need be, brought down.
  5.  All configuration is applied.
  6.  The old VR is killed.
  7.  The interfaces on the new VR are brought up.
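
To make the ordering concrete, here is a minimal sketch of the seven steps in Python. Everything in it is hypothetical: the Router class, replace_legacy_router and the log strings are illustration only and are not CloudStack code. It also assumes that "if need be" in step 4 means "the old VR is still serving traffic".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Router:
    """Hypothetical stand-in for a virtual router; not a CloudStack class."""
    name: str
    running: bool = False
    interfaces_up: bool = False
    pending_config: List[str] = field(default_factory=list)
    applied_config: List[str] = field(default_factory=list)


def replace_legacy_router(old: Router, config: List[str], log: List[str]) -> Router:
    """Replace `old` with a new router, following steps 1 to 7 of the proposal."""
    # Step 1: check whether the old VR is still up.
    old_is_up = old.running
    log.append(f"check old up={old_is_up}")

    # Step 2: instantiate the new VR.
    new = Router(name=old.name + "-new", running=True)
    log.append("instantiate new")

    # Step 3: send the configuration data, but do not apply it yet.
    new.pending_config = list(config)
    log.append("send config")

    # Step 4: add the interfaces. Assumption: they only need to stay down
    # while the old VR is still serving traffic.
    new.interfaces_up = not old_is_up
    log.append("add interfaces")

    # Step 5: apply all configuration.
    new.applied_config, new.pending_config = new.pending_config, []
    log.append("apply config")

    # Step 6: kill the old VR (nothing to do if it was already gone).
    if old_is_up:
        old.running = False
        old.interfaces_up = False
        log.append("kill old")

    # Step 7: bring the interfaces on the new VR up.
    if not new.interfaces_up:
        new.interfaces_up = True
        log.append("bring up interfaces")
    return new
```

In this sketch the downtime window is the gap between steps 6 and 7, which is why all configuration is applied before the old VR is killed.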

Ad .2 This is a long-term goal. At the moment we have five (or debatably six) different incarnations
of the virtual router:

  *   Basic zone dhcp server
  *   Shared network ‘router’
  *   VR
  *   rVR
  *   VPC
  *   rVPC
A first set of steps will be to reduce this to:

  *   shared networks (where a basic zone is an automatic implementation of a single shared
network in a zone)
  *   VR (which is always redundancy-enabled but may have only one instance)
  *   VPC (as above)
The next step is then to unify VR and VPC, as a VR is really only a VPC with just one network.
The final step is to unify a shared network with a VPC, and this one is so far ahead that
I don't want to make too many statements about it now. We will have to find the exact implementation
hazards that we will face in this step along the way. I think we will be at least one
year in by the time we reach this point.
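
The consolidation can be written down as a small mapping. This is only a sketch of the grouping described above, with hypothetical names (Legacy, Unified, FIRST_STEP, after_second_step) that do not correspond to any existing CloudStack types.

```python
from enum import Enum


class Legacy(Enum):
    """The six current incarnations listed in the proposal."""
    BASIC_ZONE_DHCP = "Basic zone dhcp server"
    SHARED_NETWORK = "Shared network 'router'"
    VR = "VR"
    RVR = "rVR"
    VPC = "VPC"
    RVPC = "rVPC"


class Unified(Enum):
    """The three embodiments left after the first set of steps."""
    SHARED_NETWORK = "shared networks"
    VR = "VR"
    VPC = "VPC"


# First set of steps: a basic zone becomes a single shared network, and the
# redundant variants fold into their always redundancy-enabled counterparts.
FIRST_STEP = {
    Legacy.BASIC_ZONE_DHCP: Unified.SHARED_NETWORK,
    Legacy.SHARED_NETWORK: Unified.SHARED_NETWORK,
    Legacy.VR: Unified.VR,
    Legacy.RVR: Unified.VR,
    Legacy.VPC: Unified.VPC,
    Legacy.RVPC: Unified.VPC,
}


def after_second_step(kind: Unified) -> Unified:
    """Next step: a VR is treated as a VPC with just one network."""
    return Unified.VPC if kind is Unified.VR else kind
```

The final step, unifying shared networks with VPC, is deliberately left out of the sketch, since the proposal does not yet say how it would work.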

At ShapeBlue we will be starting a short PoC on the first part. We will try to figure out
whether the process under 1 is feasible, or whether we need to postpone configuring interfaces
to the last moment and then do a 'blind' start.
