cloudstack-dev mailing list archives

From Anshul Gangwar <anshul.gang...@citrix.com>
Subject Re: [DISCUSS][PROPOSAL]missing power state reports from hypervisors on VMs ([BLOCKER]?)
Date Wed, 16 Sep 2015 10:25:32 GMT
I don’t think there was any discussion around this. Kelven has made fixes around VMSync.
For details, see the FS: https://cwiki.apache.org/confluence/display/CLOUDSTACK/FS+-+VMSync+improvement

Regards,
Anshul

On 16-Sep-2015, at 3:32 PM, Daan Hoogland <daan.hoogland@gmail.com> wrote:

On Wed, Sep 16, 2015 at 11:46 AM, Anshul Gangwar <anshul.gangwar@citrix.com> wrote:

It’s not difficult to find a good grace period. It simply depends on
how HA is configured in your hypervisor settings. From those settings you can
easily work out the longest time a VM may be absent from every host,
and set the grace period to 2-3 times that.

That seems kludgey.


It seems you have considered only one aspect of the change, i.e. user VM HA.
Did you consider system VM HA?
Did you consider that we have already explored separate
handling of PowerOff and PowerReportMissing?

For VMware or for all hypervisors? Do you have a link to the discussion?
These states are different.

Why was it decided to treat them the same?

And if you are still considering this change, then add Marvin tests
for it. Unit tests will not tell us anything about the change.

Yes, that I definitely agree on.



Regards,
Anshul

On 16-Sep-2015, at 2:48 PM, Rene Moser <mail@renemoser.net> wrote:


Hi René

On 09/16/2015 10:17 AM, Anshul Gangwar wrote:
Currently we report only PowerOn VMs and do not report PowerOff VMs,
which is why we treat Missing and PowerOff as the same. Most of
the VM sync code is written that way, and each hypervisor resource shares the same
understanding. This will affect HA and many other unknown places. So please
do not even consider merging this change.

Coming to the bug: we can fix it by changing the global setting
pingInterval to a value appropriate for the hypervisor settings, so that it
covers the transitional period in which reports are missing, or it can be
handled by introducing a gracePeriod global setting.
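
As a rough illustration of the grace-period idea above (this is not CloudStack's actual code): a VM whose power report is missing is only treated as powered off once its last report is older than the grace period. The function names and the multiplier of 2 are assumptions made for this sketch.

```python
from datetime import datetime, timedelta


def grace_period(ping_interval_seconds: int, multiplier: float = 2.0) -> timedelta:
    """Hypothetical helper: derive the grace period from the ping interval."""
    return timedelta(seconds=ping_interval_seconds * multiplier)


def treat_as_powered_off(last_report: datetime, now: datetime, grace: timedelta) -> bool:
    """Return True only when the last power report is older than the grace period."""
    return now - last_report > grace


now = datetime(2015, 9, 16, 10, 0, 0)
grace = grace_period(60)  # assumed 60 s ping interval, so a 120 s grace period

# Report missing for 90 s: still inside the grace period, keep the VM state.
print(treat_as_powered_off(now - timedelta(seconds=90), now, grace))
# Report missing for 5 min: outside the grace period, treat as powered off.
print(treat_as_powered_off(now - timedelta(seconds=300), now, grace))
```

Raising either the ping interval or the multiplier widens the window in which a missing report is tolerated, which is the trade-off being discussed here.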

This is interesting; I also wrote in the bug report that the gracePeriod
calculation might be related:

https://github.com/apache/cloudstack/blob/4.5.2/engine/orchestration/src/com/cloud/vm/VirtualMachinePowerStateSyncImpl.java#L110

IMHO making this value configurable might solve it, but it is hard
to "guess" what a good grace period would be.

For VMware it depends on the number of ESX hosts in the clusters, and
these can differ.

But another question is: why have one _global_ grace period for every
hypervisor? Keep in mind that users can have mixed-hypervisor setups.

So to me, a global grace period setting might not be the best solution;
instead we should take the hypervisor's own functionality into account. In this case,
VMware handles HA by itself.
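
To make the per-hypervisor idea concrete, here is a minimal sketch, assuming a lookup table of overrides that falls back to a global default. None of these names or values exist in CloudStack; they are hypothetical and only illustrate the shape of such a setting.

```python
from typing import Dict, Optional

GLOBAL_GRACE_SECONDS = 120  # assumed global default, for illustration only

# Hypothetical per-hypervisor overrides. None means "never infer PowerOff
# from a missing report", e.g. because the hypervisor handles HA itself.
GRACE_OVERRIDES: Dict[str, Optional[int]] = {"VMware": None, "KVM": 180}


def grace_seconds_for(hypervisor: str) -> Optional[int]:
    """Use the per-hypervisor value if one is set, else the global default."""
    return GRACE_OVERRIDES.get(hypervisor, GLOBAL_GRACE_SECONDS)


def treat_missing_as_powered_off(hypervisor: str, seconds_missing: int) -> bool:
    """Decide whether a VM with a missing report should count as powered off."""
    grace = grace_seconds_for(hypervisor)
    if grace is None:
        return False  # leave the VM state alone for this hypervisor
    return seconds_missing > grace
```

With a table like this, a mixed setup could tolerate long gaps on one hypervisor type without delaying detection on the others.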

I know a VR in 4.5 would be broken after a VMware HA event, but there
is another global setting that can be enabled, if you like, to restart
routers after out-of-band migrations.

So to me, for 4.5 I am +1 on Daan's patch if the
hypervisor is VMware.

Yours
René





--
Daan
