cloudstack-issues mailing list archives

From "edison su (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-5582) kvm - HA is not triggered when host is powered down since the host gets into "Disconnected" state.
Date Fri, 20 Dec 2013 19:08:10 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13854433#comment-13854433 ]

edison su commented on CLOUDSTACK-5582:
---------------------------------------

The HA manager has a bug that was introduced by Alex's commit 5297a071d2c20040878950172b8d0211ac7cb436.

In HaManagerImpl->scheduleRestart, if investigate is passed as "false", which is the case
when the KVM agent connects back to the management server, the code stops the VM but does
not reload the vm object afterwards. As a result, this line of code:

    HaWorkVO work = new HaWorkVO(vm.getId(), vm.getType(), WorkType.HA,
            investigate ? Step.Investigating : Step.Scheduled,
            hostId, vm.getState(), maxRetries + 1, vm.getUpdated());

stores the VM state as Running in the HaWorkVO, even though the VM has just been stopped.

Later, this check is reached:

    s_logger.info("HA on " + vm);
    if (vm.getState() != work.getPreviousState() || vm.getUpdated() != work.getUpdateTime()) {
        s_logger.info("VM " + vm + " has been changed.  Current State = " + vm.getState()
                + " Previous State = " + work.getPreviousState() + " last updated = " + vm.getUpdated()
                + " previous updated = " + work.getUpdateTime());
        return null;
    }

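Because the current VM state no longer matches the state recorded in the work item, the
check returns null and HA is never triggered.

The fix is to reload the vm state in scheduleRestart, after
_itMgr.advanceStop(vm.getUuid(), true); is called.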
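
The failure mode described above can be reproduced in isolation. The following is only a minimal sketch, not CloudStack code: the class StaleVmStateSketch, the in-memory "db" map, and the VmSnapshot and HaWork types are all hypothetical stand-ins for the real VM record and HaWorkVO. It assumes that stopping a VM changes both its persisted state and its update counter, while a previously loaded object keeps the old values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the stale-state problem; not CloudStack code.
class StaleVmStateSketch {
    enum State { Running, Stopped }

    // Immutable snapshot of a VM row, standing in for the loaded vm object.
    static final class VmSnapshot {
        final long id;
        final State state;
        final long updated;
        VmSnapshot(long id, State state, long updated) {
            this.id = id;
            this.state = state;
            this.updated = updated;
        }
    }

    // Stand-in for HaWorkVO: remembers the VM state seen when the work was created.
    static final class HaWork {
        final long vmId;
        final State previousState;
        final long updateTime;
        HaWork(long vmId, State previousState, long updateTime) {
            this.vmId = vmId;
            this.previousState = previousState;
            this.updateTime = updateTime;
        }
    }

    // Stand-in for the database table of VMs.
    static final Map<Long, VmSnapshot> db = new HashMap<>();

    // Stand-in for advanceStop: changes the persisted row, not the caller's snapshot.
    static void advanceStop(long vmId) {
        VmSnapshot old = db.get(vmId);
        db.put(vmId, new VmSnapshot(vmId, State.Stopped, old.updated + 1));
    }

    // Mirrors the investigate == false path: stop the VM, then create the work item.
    static HaWork scheduleRestart(VmSnapshot vm, boolean reloadAfterStop) {
        advanceStop(vm.id);
        if (reloadAfterStop) {
            vm = db.get(vm.id); // the proposed fix: reload before recording the state
        }
        return new HaWork(vm.id, vm.state, vm.updated);
    }

    // Mirrors the check in the restart step: proceed only if nothing has changed.
    static boolean restartProceeds(HaWork work) {
        VmSnapshot vm = db.get(work.vmId);
        return vm.state == work.previousState && vm.updated == work.updateTime;
    }

    static boolean simulate(boolean reloadAfterStop) {
        db.put(1L, new VmSnapshot(1L, State.Running, 0L));
        HaWork work = scheduleRestart(db.get(1L), reloadAfterStop);
        return restartProceeds(work);
    }

    public static void main(String[] args) {
        System.out.println("without reload: " + simulate(false)); // false
        System.out.println("with reload: " + simulate(true));     // true
    }
}
```

Without the reload, the work item records Running while the stored VM is already Stopped, so the comparison fails and the restart is skipped. With the reload, both sides agree and the restart can proceed.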



> kvm - HA is not triggered when host is powered down since the host gets into "Disconnected" state.
> ---------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5582
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5582
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>            Assignee: edison su
>            Priority: Critical
>             Fix For: 4.3.0
>
>
> kvm - HA is not triggered when host is powered down since the host gets into "Disconnected" state.
> Advanced zone with 2 KVM (RHEL 6.3) hosts.
> Steps to reproduce the problem:
> Deploy a few VMs on each of the hosts.
> Power down one of the hosts (using IPMI).
> We see that the host gets into "Disconnected" state.
> All the VMs that are running on this host continue to be in "Up" state.
> This happens because the management server receives an explicit shutdown request from the agent:
> 2013-12-19 21:06:37,262 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-15:null) SeqA 2--1: Processing Seq 2--1:  { Cmd , MgmtId: -1, via: 2, Ver: v1, Flags: 111, [{"com.cloud.agent.api.ShutdownCommand":{"reason":"sig.kill","wait":0}}] }
> 2013-12-19 21:06:37,263 INFO  [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-15:null) Host 2 has informed us that it is shutting down with reason sig.kill and detail null
> 2013-12-19 21:06:37,263 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-a32ed8e2) Host 2 is disconnecting with event ShutdownRequested
> 2013-12-19 21:06:37,264 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-a32ed8e2) The next status of agent 2is Disconnected, current status is Up
> 2013-12-19 21:06:37,264 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-a32ed8e2) Deregistering link for 2 with state Disconnected
> 2013-12-19 21:06:37,264 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-a32ed8e2) Remove Agent : 2
> 2013-12-19 21:06:37,264 DEBUG [c.c.a.m.ConnectedAgentAttache] (AgentTaskPool-1:ctx-a32ed8e2) Processing Disconnect.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
