cloudstack-dev mailing list archives

From Rafael Weingärtner <rafaelweingart...@gmail.com>
Subject Re: [DISCUSS] VR upgrade downtime reduction
Date Wed, 07 Feb 2018 10:02:01 GMT
 ONE-VR approach in ACS 5.0. It is time to plan for a major release and
break some things...

On Wed, Feb 7, 2018 at 7:17 AM, Paul Angus <paul.angus@shapeblue.com> wrote:

> It seems sensible to me to have ONE VR, and I like the idea that all
> VRs are 'redundant-ready', again supporting the ONE-VR approach.
>
> The question I have is:
>
> - how do we handle the transition - does it need ACS 5.0?
> The API and the UI separate the VR and the VPC, so what is the most
> logical presentation of the proposed solution to the users/operators?
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HS UK
> @shapeblue
>
>
>
>
> -----Original Message-----
> From: Daan Hoogland [mailto:daan.hoogland@gmail.com]
> Sent: 07 February 2018 08:58
> To: dev <dev@cloudstack.apache.org>
> Subject: Re: [DISCUSS] VR upgrade downtime reduction
>
> Reading all the reactions I am getting wary of all the possible solutions
> that we have.
>  We do have a fragile VR and Remi's way seems the only one to stabilise it.
> It also answers the question on which of my two tactics we should follow.
>  Wido's objection may be valid but services that are not started are not
> crashing and thus should not hinder him.
>  As for Wei's changes I think the most important one is in the PR I ported
> forward to master, using his older commit. I mentioned it in
>     [1] https://github.com/apache/cloudstack/pull/2435
> I am looking forward to any of your PRs as well Wei.
>
>  Making all VRs redundant is a bit of a hack and the biggest risk in it is
> making sure that only one will get started.
>
>  There is one point I'd like consensus on: we have only one system
> template and we are well served by letting it have only one form as VR. Do
> we agree on that?
>
> comments, flames, questions, regards,
>
>
> On Tue, Feb 6, 2018 at 9:04 PM, Wei ZHOU <ustcweizhou@gmail.com> wrote:
>
> > Hi Remi,
> >
> > Actually in our fork, there are more changes than restartnetwork and
> > restart vpc, similar to your changes.
> > (1) edit networks from offering with single VR to offerings with RVR,
> > will hack VR (set new guest IP, start keepalived and conntrackd,
> > blablabla)
> > (2) restart vpc from single VR to RVR. similar changes will be made.
> > The downtime is around 5s. However, these changes are based on 4.7.1, we
> > are not sure if they still work in 4.11
> >
> > We have lots of changes, we will port the changes to 4.11 LTS and
> > create PRs in the next months.
> >
> > -Wei
> >
> >
> > 2018-02-06 14:47 GMT+01:00 Remi Bergsma <RBergsma@schubergphilis.com>:
> >
> > > Hi Daan,
> > >
> > > In my opinion the biggest issue is the fact that there are a lot of
> > > different code paths: VPC versus non-VPC, VPC versus redundant-VPC, etc.
> > > That's why you cannot simply switch from a single VPC to a redundant
> > > VPC for example.
> > >
> > > For SBP, we mitigated that in Cosmic by converting all non-VPCs to a
> > > VPC with a single tier and made sure all features are supported.
> > > Next we merged the single and redundant VPC code paths. The idea here
> > > is that redundancy or not should only be a difference in the number of
> > > routers. Code should be the same. A single router is also "master" but
> > > there just is no "backup".
> > >
> > > That simplifies things A LOT, as keepalived is now the master of the
> > > whole thing. No more assigning IP addresses in Python, but leave that to
> > > keepalived instead. Lots of code deleted. Easier to maintain, way
> > > more stable. We just released Cosmic 6 that has this feature and are
> > > now rolling it out in production. Looking good so far. This change
> > > unlocks a lot of possibilities, like live upgrading from a single VPC
> > > to a redundant one (and back). In the end, if the redundant VPC is rock
> > > solid, you most likely don't even want single VPCs any more. But that
> > > will come.
> > >
> > > As I said, we're rolling this out as we speak. In a few weeks when
> > > everything is upgraded I can share what we learned and how well it
> > > works. CloudStack could use a similar approach.
> > >
> > > Kind Regards,
> > > Remi
> > >
> > >
> > >
> > > On 05/02/2018, 16:44, "Daan Hoogland" <daan.hoogland@gmail.com> wrote:
> > >
> > >     Hi devs,
> > >
> > >     I have recently (re-)submitted two PRs, one by Wei [1] and one by
> > >     Remi [2], that reduce downtime for redundant routers and redundant
> > >     VPCs respectively. (please review those)
> > >     Now from customers we hear that they also want to reduce downtime
> > >     for regular VRs so as we discussed this we came to two possible
> > >     solutions, one of which we want to implement:
> > >
> > >     1. start and configure a new router before destroying the old one
> > >     and then as a last minute action stop the old one.
> > >     2. make all routers start up redundancy services but for regular
> > >     routers start only one until an upgrade is required at which time a
> > >     new, second router can be started before killing the old one.
> > >
> > >     Obviously both solutions have their merits, so I want to have your
> > >     input to make the broadest supported implementation.
> > >     -1 means there will be an overlap or a small delay and interruption
> > >     of service.
> > >     +1 It can be argued, "they got what they paid for".
> > >     -2 means an overhead in memory usage by the router by the extra
> > >     services running on it.
> > >     +2 the number of router-varieties will be further reduced.
> > >
> > >     -1&-2 We have to deal with potentially large upgrade steps from way
> > >     before the cloudstack era even and might be stuck to 1 because of
> > >     that, needing to hack around it. Any dealing with older VRs, pre 4.5
> > >     and especially pre 4.0 will be hard.
> > >
> > >     I am not cross posting though this might be one of these occasions
> > >     where it is appropriate to include users@. Just my puristic
> > >     inhibitions.
> > >
> > >     Of course I have preferences but can you share your thoughts, please?
> > >
> > >     And don't forget to review Wei's [1] and Remi's [2] work please.
> > >
> > >     [1] https://github.com/apache/cloudstack/pull/2435
> > >     [2] https://github.com/apache/cloudstack/pull/2436
> > >
> > >     --
> > >     Daan
> > >
> > >
> > >
> >
>
>
>
> --
> Daan
>



-- 
Rafael Weingärtner
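
Editorial sketch (not part of the original thread): the "redundant-ready" idea discussed above, where a single router is simply a master without a backup and an upgrade adds a second router before the old one is stopped, can be illustrated with a toy model. This is not CloudStack, Cosmic, or keepalived code. The names Router, elect_master and upgrade are hypothetical, and the election rule (the running router with the highest priority wins) is a simplifying assumption loosely modelled on priority-based master election.

```python
from dataclasses import dataclass


@dataclass
class Router:
    # Hypothetical stand-in for a virtual router instance.
    name: str
    priority: int
    running: bool = True


def elect_master(routers):
    """Return the running router with the highest priority, or None."""
    candidates = [r for r in routers if r.running]
    return max(candidates, key=lambda r: r.priority, default=None)


def upgrade(routers, new_name):
    """Add a lower-priority backup, then stop the old master.

    Returns the master's name after each step, so the caller can check
    that some router is master throughout the upgrade.
    """
    old = elect_master(routers)
    masters = [old.name]                          # before the upgrade
    routers.append(Router(new_name, priority=old.priority - 1))
    masters.append(elect_master(routers).name)    # backup is up, old still master
    old.running = False
    masters.append(elect_master(routers).name)    # backup has taken over
    return masters


if __name__ == "__main__":
    print(upgrade([Router("r1", priority=100)], "r2"))
```

In this model the single-router and redundant cases share one code path: the only difference is how many entries the list holds, which is the simplification Remi describes.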
