cloudstack-users mailing list archives

From Andrija Panic <andrija.pa...@gmail.com>
Subject Re: [VOTE] Apache CloudStack 4.14.0.0 RC3
Date Wed, 20 May 2020 21:02:53 GMT
@gregor - legacy boot should be fine with UEFI (I have run that on some of
my laptops); UEFI is not the problem. The issue also happens with 4.13, and
any VirtualBox OVA file will trigger it.

###############################################
To conclude the ISSUE, based on a few hours of testing today:

- happens when you deliberately use a VirtualBox OVA template with vSphere
(who would do that, and why, is another topic), in ACS 4.13.x and
4.14/master

...out of which...:

- does NOT happen with vCenter 6.0 and 6.5 (confirmed by Daan/Bobby);
proper OVF parsing takes place and an error message is generated in ACS logs
- NOT tested:   6.7 / 6.7 U1xxx / 6.7 U2xxx (i.e. not tested with any
variant < 6.7 U3)
- issues happen with vCenter 6.7 U3 / U3a / U3b / U3f - these were
explicitly tested by me, and some vCenter services would crash (though still
appearing as running). The problem is solved by restarting (most?)
services: restarting "VMware afd Service" triggers a restart of its
dependent services, and after a while vCenter is UP again (I could not
find which single service is the actual culprit)
- Worth mentioning: this was observed with vCenter on Windows Server, not
the VCSA appliance

- seems FINE - NO ISSUES with vCenter 6.7 U3g (the latest 6.7 U3 variant
at the moment - build 16046470 from 28.04.2020); the VM deployment fails
gracefully with a proper error message about not being able to create the
SPEC file based on the (bad) OVF.
################################################
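
For anyone who wants to encode the matrix above (e.g. for a check before
importing an OVA of unknown origin), here is a minimal Python sketch. It is
only an illustration of the findings in this mail: the names Status,
FINDINGS, status_for and is_safe_for_ova_import are hypothetical and not part
of CloudStack or any VMware SDK, the version strings are an assumed format,
and treating unknown versions as "not tested" is my assumption as well.

```python
from enum import Enum


class Status(Enum):
    NOT_AFFECTED = "not affected"
    NOT_TESTED = "not tested"
    AFFECTED = "affected"
    FIXED = "fixed"


# Findings as reported in this thread; keys are lower-cased version strings
# (an assumed format, e.g. "6.7 u3g").
FINDINGS = {
    "6.0": Status.NOT_AFFECTED,
    "6.5": Status.NOT_AFFECTED,
    "6.7": Status.NOT_TESTED,
    "6.7 u1": Status.NOT_TESTED,
    "6.7 u2": Status.NOT_TESTED,
    "6.7 u3": Status.AFFECTED,
    "6.7 u3a": Status.AFFECTED,
    "6.7 u3b": Status.AFFECTED,
    "6.7 u3f": Status.AFFECTED,
    "6.7 u3g": Status.FIXED,
}


def status_for(version: str) -> Status:
    """Look up a vCenter version; unknown versions count as NOT_TESTED."""
    key = " ".join(version.lower().split())  # normalize case and whitespace
    return FINDINGS.get(key, Status.NOT_TESTED)


def is_safe_for_ova_import(version: str) -> bool:
    """True only where the thread reports a graceful failure on a bad OVF."""
    return status_for(version) in (Status.NOT_AFFECTED, Status.FIXED)
```

With this, is_safe_for_ova_import("6.7 U3") is False, while "6.5" and
"6.7 U3g" pass. Untested versions are deliberately treated as unsafe.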

Since the issue is solved in the (currently) latest vSphere 6.7 U3g variant,
I will make sure a proper warning is included in both the 4.13.1 and
4.14.0.0 release notes (4.13 is when we started supporting vSphere 6.7, and
the same issue is present there).

I'll proceed tomorrow with releasing 4.14 based on the voting done so far.

Thanks

On Wed, 20 May 2020 at 22:09, Marcus <shadowsor@gmail.com> wrote:

> I would say, if it is proven that this happens with existing released
> CloudStack versions, with or without the UEFI feature, against a specific
> VMware release with a specific broken template, then it becomes an
> environment issue and shouldn't block the release.  In this case it would
> not matter if we tried to revert the feature, or if we did or did not
> release 4.14, the users who would hit this would be hitting this now in
> live environments, with the released versions of CloudStack.
>
> To be clear, I'm not 100% certain this is exactly what Bobby was saying,
> but if this is the case then I think it should not block us.
>
> On Wed, May 20, 2020 at 1:00 AM Riepl, Gregor (SWISS TXT) <
> Gregor.Riepl@swisstxt.ch> wrote:
>
> > Hi everyone
> >
> > Sorry for the late response, but I have a few concerns:
> >
> >
> >   *   As Bobby stated, this bug seems to only occur with VMware 6.7+, and
> > it sounds to me like they should take action on it. Does someone track
> > this with VMware?
> >   *   Do I understand correctly that the issue only occurs when the image
> > is set to UEFI mode, but the VM is configured as Legacy Boot in
> > CloudStack? How would this combination even work? I think CloudStack
> > should either reject such a mismatch or autocorrect it. Or at least
> > display a warning to the user.
> >   *   If the bug can break vCenter (if only temporarily), there should
> > definitely be some sort of safeguard around it, even if it isn't a proper
> > fix or workaround.
> >
> > Regards,
> > Gregor
> > ________________________________
> > From: Andrija Panic <andrija.panic@gmail.com>
> > Sent: 19 May 2020 21:11
> > To: users <users@cloudstack.apache.org>
> > Cc: dev@cloudstack.apache.org <dev@cloudstack.apache.org>
> > Subject: Re: [VOTE] Apache CloudStack 4.14.0.0 RC3
> >
> > Hi all,
> >
> > In my humble opinion, we should release 4.14 as it is (considering we
> > have enough votes), but we'll further investigate the
> > actual/behind-the-scene root-cause for the vSphere 6.7 harakiri
> > (considering 6.0 and 6.5 are not affected) - this is possibly a VMware
> > bug and we'll certainly try to address it.
> >
> > If I don't hear any more concerns or -1 votes until tomorrow morning CET
> > time, I will proceed with concluding the voting process and crafting the
> > release.
> >
> > Thanks,
> > Andrija
> >
> > On Tue, 19 May 2020 at 19:23, Pavan Kumar Aravapalli <
> > pavankumar_a@accelerite.com> wrote:
> >
> > > Thank you Bobby and Daan for the update. However I have not encountered
> > > such an issue while doing dev test with VMware 5.5 & 6.5.
> > >
> > >
> > >
> > >
> > >
> > > Regards,
> > >
> > > Pavan Aravapalli.
> > >
> > >
> > > ________________________________
> > > From: Daan Hoogland <daan.hoogland@gmail.com>
> > > Sent: 19 May 2020 20:56
> > > To: users <users@cloudstack.apache.org>
> > > Cc: dev@cloudstack.apache.org <dev@cloudstack.apache.org>
> > > Subject: Re: [VOTE] Apache CloudStack 4.14.0.0 RC3
> > >
> > > Thanks Bobby,
> > > All, I've been closely working with Bobby and seen the same things.
> > > Does anybody see any issues releasing 4.14 based on this code? I can
> > > confirm that it is not Pavernalli's UEFI PR and we should not create a
> > > new PR to revert it.
> > > thanks for all of your patience,
> > >
> > > (this is me giving a binding +1)
> > >
> > >
> > > On Tue, May 19, 2020 at 5:04 PM Boris Stoyanov <
> > > boris.stoyanov@shapeblue.com>
> > > wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I've done more testing around this and I can now confirm it has
> > > > nothing to do with cloudstack code.
> > > >
> > > > I've tested it with rc3, with the UEFI PR reverted, and with 4.13.1
> > > > (which does not have the feature at all). I've also used a matrix of
> > > > VMware versions: 6.0u2, 6.5u2 and 6.7u3.
> > > >
> > > > The bug is reproducible with all the cloudstack versions, but only
> > > > with vmware 6.7u3; I was not able to reproduce it with 6.5/6.0. All of
> > > > my results during testing show it must be related to that specific
> > > > version of VMware.
> > > >
> > > > Therefore I'm reversing my '-1' and giving a +1 vote on the RC. I
> > > > think the release notes need to advise refraining from that version
> > > > for now, until further investigation is done.
> > > >
> > > > Thanks,
> > > > Bobby.
> > > >
> > > > On 19.05.20, 10:08, "Boris Stoyanov" <boris.stoyanov@shapeblue.com>
> > > > wrote:
> > > >
> > > >     Indeed it is severe, but please note it's a corner case which was
> > > >     unearthed almost by accident. It comes down to using the new
> > > >     feature of selecting a boot protocol, and the template must be
> > > >     corrupted. So with already existing templates I would not expect
> > > >     to encounter it.
> > > >
> > > >     As for recovery, we've managed to recover vCenter and Cloudstack
> > > >     after reboots of the vCenter machine and the Cloudstack management
> > > >     service. There are no exact recovery steps for now, but a restart
> > > >     seems to work.
> > > >     By graceful failure I mean cloudstack erroring out the deployment
> > > >     and the VM finishing in ERROR state, while connection and
> > > >     operability with the vCenter cluster remain the same.
> > > >
> > > >     We're currently exploring options to fix this. One could be to
> > > >     disable the feature for VMware and work to introduce a more
> > > >     sustainable fix in the next release. The other is to look at adding
> > > >     more guarding code when installing a template, since VMware doesn't
> > > >     actually allow you to install that particular template but
> > > >     cloudstack does. We'll keep you posted.
> > > >
> > > >     Thanks,
> > > >     Bobby.
> > > >
> > > >     On 18.05.20, 23:01, "Marcus" <shadowsor@gmail.com> wrote:
> > > >
> > > >         The issue sounds severe enough that a release note probably
> > > >         won't suffice - unless there's a documented way to recover we'd
> > > >         never want to leave a system susceptible to being unrecoverable,
> > > >         even if it's rarely triggered.
> > > >
> > > >         What's involved in "failing gracefully"? Is this a small fix, or
> > > >         an overhaul?  Perhaps the new feature could be disabled for
> > > >         VMware, or disabled altogether until a fix is made in a patch
> > > >         release.
> > > >
> > > >         Does it only affect new templates, or is there a risk that an
> > > >         existing template out in vSphere could suddenly cause problems?
> > > >
> > > >         On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
> > > >         boris.stoyanov@shapeblue.com> wrote:
> > > >
> > > >         > Hi guys,
> > > >         >
> > > >         > A little further info on this: it appears that when we use a
> > > >         > corrupted template and UEFI/Legacy mode when deploying a VM,
> > > >         > it breaks the connection between cloudstack and vCenter.
> > > >         >
> > > >         > All hosts become unreachable and basically the cluster is not
> > > >         > functional. I have not investigated a way to recover from
> > > >         > this, but it seems like a huge mess.
> > > >         > Please note that a user is not able to register such a
> > > >         > template in vCenter directly, but cloudstack allows using it.
> > > >         >
> > > >         > Open to discussing whether we'll fix this. Since users are
> > > >         > expected to use working templates, I think we should be
> > > >         > failing gracefully, and such an action should not be able to
> > > >         > create downtime on such a large scale.
> > > >         >
> > > >         > I believe the boot type feature is a new one and it's not
> > > >         > available in older releases, so this issue should be limited
> > > >         > to 4.14/current master.
> > > >         >
> > > >         > Thanks,
> > > >         > Bobby.
> > > >         >
> > > >         > On 15.05.20, 17:07, "Boris Stoyanov" <
> > > > boris.stoyanov@shapeblue.com>
> > > >         > wrote:
> > > >         >
> > > >         >     I'll have to -1 RC3: this afternoon we discovered details
> > > >         >     about an issue which is causing severe consequences with
> > > >         >     a particular hypervisor.
> > > >         >     We'll need more time to investigate before disclosing.
> > > >         >
> > > >         >     Bobby.
> > > >         >
> > > >         >     On 15.05.20, 9:12, "Boris Stoyanov" <
> > > > boris.stoyanov@shapeblue.com>
> > > >         > wrote:
> > > >         >
> > > >         >         +1 (binding)
> > > >         >
> > > >         >         I've executed upgrade tests with the following
> > > >         >         configurations:
> > > >         >
> > > >         >         4.13.1 with KVM on CentOS7 hosts
> > > >         >         4.13 with VMware6.5 hosts
> > > >         >         4.11.3 with KVM on CentOS7 hosts
> > > >         >         4.11.2 with XenServer7 hosts
> > > >         >         4.11.1 with VMware 6.7
> > > >         >         4.9.3 with XenServer 7 hosts
> > > >         >         4.9.2 with KVM on CentOS 7 hosts
> > > >         >
> > > >         >         Also I've run basic lifecycle operations on the
> > > >         >         following components:
> > > >         >         VMs
> > > >         >         Volumes
> > > >         >         Infra (zones, pod, clusters, hosts)
> > > >         >         Networks
> > > >         >         and more
> > > >         >
> > > >         >         I did not come across any problems during this
> > testing.
> > > >         >
> > > >         >         Thanks,
> > > >         >         Bobby.
> > > >         >
> > > >         >
> > > >         >         On 11.05.20, 18:21, "Andrija Panic" <
> > > > andrija.panic@gmail.com>
> > > >         > wrote:
> > > >         >
> > > >         >             Hi All,
> > > >         >
> > > >         >             I've created a 4.14.0.0 release (RC3), with the
> > > >         >             following artefacts up for testing and a vote:
> > > >         >
> > > >         >             Git Branch and Commit SHA:
> > > >         >             https://gitbox.apache.org/repos/asf?p=cloudstack.git;a=shortlog;h=refs/heads/4.14.0.0-RC20200511T1503
> > > >         >             Commit: 6f96b3b2b391a9b7d085f76bcafa3989d9832b4e
> > > >         >
> > > >         >             Source release (checksums and signatures are
> > > >         >             available at the same location):
> > > >         >             https://dist.apache.org/repos/dist/dev/cloudstack/4.14.0.0/
> > > >         >
> > > >         >             PGP release keys (signed using 3DC01AE8):
> > > >         >             https://dist.apache.org/repos/dist/release/cloudstack/KEYS
> > > >         >
> > > >         >             The vote will be open until 14th May 2020, 17.00
> > > >         >             CET (72h).
> > > >         >
> > > >         >             For sanity in tallying the vote, can PMC members
> > > >         >             please be sure to indicate "(binding)" with their
> > > >         >             vote?
> > > >         >
> > > >         >             [ ] +1 approve
> > > >         >             [ ] +0 no opinion
> > > >         >             [ ] -1 disapprove (and reason why)
> > > >         >
> > > >         >             Additional information:
> > > >         >
> > > >         >             For users' convenience, I've built packages from
> > > >         >             6f96b3b2b391a9b7d085f76bcafa3989d9832b4e and
> > > >         >             published RC3 repository here:
> > > >         >             http://packages.shapeblue.com/testing/41400rc3/
> > > >         >             (CentOS 7 and Debian/generic, both with noredist
> > > >         >             support)
> > > >         >             and here
> > > >         >             https://download.cloudstack.org/testing/4.14.0.0-RC20200506T2028/ubuntu/bionic/
> > > >         >             (Ubuntu 18.04 specific, no noredist support -
> > > >         >             thanks to Gabriel):
> > > >         >
> > > >         >             The release notes are still work-in-progress, but
> > > >         >             for the upgrade instructions (including the new
> > > >         >             systemVM templates) you may refer to the following
> > > >         >             URL:
> > > >         >             https://acs-www.shapeblue.com/docs/WIP-PROOFING/pr112/upgrading/index.html
> > > >         >
> > > >         >             4.14.0.0 systemVM templates are available from
> > > >         >             here:
> > > >         >             http://download.cloudstack.org/systemvm/4.14/
> > > >         >
> > > >         >             NOTES on issues fixed in this RC3 release:
> > > >         >
> > > >         >             (this one does *NOT* require a full retest if you
> > > >         >             were testing RC1/RC2 already - just if you were
> > > >         >             affected by this issue):
> > > >         >             - https://github.com/apache/cloudstack/pull/4064 -
> > > >         >             affects hostnames when attaching a VM to
> > > >         >             additional networks
> > > >         >
> > > >         >             Regards,
> > > >         >
> > > >         >
> > > >         >             Andrija Panić
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >     boris.stoyanov@shapeblue.com
> > > >         >     www.shapeblue.com
> > > >         >     3 London Bridge Street, 3rd floor, News Building,
> > > >         >     London SE1 9SG, UK
> > > >         >     @shapeblue
> > >
> > > --
> > > Daan
> > >
> >
> >
> > --
> >
> > Andrija Panić
> >
>


-- 

Andrija Panić
