cloudstack-issues mailing list archives

From "Sanjeev N (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-5610) [Hyper-v] Host does not go into Alert state even though it is power-off hence vm deployment fails
Date Mon, 23 Dec 2013 13:29:53 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855631#comment-13855631 ]

Sanjeev N commented on CLOUDSTACK-5610:
---------------------------------------

Similar behavior has also been observed in the case of a network disconnect.

> [Hyper-v] Host does not go into Alert state even though it is power-off hence vm deployment
fails
> -------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5610
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5610
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Hypervisor Controller, Management Server
>    Affects Versions: 4.3.0
>         Environment: Latest build from 4.3 with commit :d462db4ae5c30e677d5810111f9ea5ca6812bce2
> Storage: SMB for both primary and secondary
> Hypervisor: Hyper-v
>            Reporter: Sanjeev N
>            Priority: Blocker
>              Labels: hyper-V,
>             Fix For: 4.3.0
>
>         Attachments: cloud.dmp, management-server.rar
>
>
> [Hyper-v] Host does not go into Alert state even though it is power-off hence vm deployment
fails
> Steps to Reproduce:
> =================
> 1. Bring up CS in an advanced zone with 2 or more Hyper-V hosts, using SMB for both primary and secondary storage.
> 2. Enable the zone and deploy a few VMs. Make sure that the VMs are distributed across all the hosts.
> 3. Power off one of the hosts on which VMs are running.
> Expected Result:
> ==============
> Host should go into Alert state and all the vms running on it should be stopped
> Actual Result:
> ============
> The host remains in the Up state and all of its VMs still show as Running.
> I could see the ping commands to the hypervisor agent and the system VM agents in the MS log. Even though the agents are behind on ping, the agent status remains Up.
> In this state, I tried to deploy a VM and the deployment planner chose the host that was powered off, so the VM deployment failed.
> The CPVM was also running on the powered-off host, and it too remained in the Running state. Since the CPVM agent is not reachable from CS, it should have been stopped and started on another host in the cluster.
> 2013-12-23 18:19:25,334 ERROR [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-331:ctx-831c60e9)
org.apache.http.conn.HttpHostConnectException: Connection to http://10.147.40.31:8250 refused
> 2013-12-23 18:19:25,334 INFO  [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-331:ctx-831c60e9)
Cannot ping host 10.147.40.31 (IP 10.147.40.31), pingAns (blank means null) is:com.cloud.agent.api.UnsupportedAnswer
> 2013-12-23 18:19:25,334 WARN  [c.c.a.m.DirectAgentAttache] (DirectAgent-331:ctx-831c60e9)
Unable to get current status on 5(10.147.40.31)
> 2013-12-23 18:19:25,336 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-be3804c7)
Investigating why host 5 has disconnected with event AgentDisconnected
> 2013-12-23 18:19:25,336 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-be3804c7)
checking if agent (5) is alive
> 2013-12-23 18:19:25,339 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 5-1482556239:
Sending  { Cmd , MgmtId: 132129494109518, via: 5(10.147.40.31), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}]
}
> 2013-12-23 18:19:25,339 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 5-1482556239:
Executing:  { Cmd , MgmtId: 132129494109518, via: 5(10.147.40.31), Ver: v1, Flags: 100011,
[{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
> 2013-12-23 18:19:25,339 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-325:ctx-39f5ed39)
Seq 5-1482556239: Executing request
> 2013-12-23 18:19:25,339 DEBUG [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-325:ctx-39f5ed39)
POST request tohttp://10.147.40.31:8250/api/HypervResource/com.cloud.agent.api.CheckHealthCommand
with contents{"contextMap":{},"wait":50}
> 2013-12-23 18:19:25,340 DEBUG [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-325:ctx-39f5ed39)
Sending cmd to http://10.147.40.31:8250/api/HypervResource/com.cloud.agent.api.CheckHealthCommand
cmd data:{"contextMap":{},"wait":50}
> 2013-12-23 18:19:46,345 DEBUG [c.c.h.UserVmDomRInvestigator] (AgentTaskPool-16:ctx-be3804c7)
checking if agent (5) is alive
> 2013-12-23 18:19:46,347 DEBUG [c.c.h.UserVmDomRInvestigator] (AgentTaskPool-16:ctx-be3804c7)
sending ping from (1) to agent's host ip address (10.147.40.31)
> 2013-12-23 18:19:46,349 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 1-790364876:
Sending  { Cmd , MgmtId: 132129494109518, via: 1(10.147.40.14), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.PingTestCommand":{"_computingHostIp":"10.147.40.31","wait":20}}]
}
> 2013-12-23 18:19:46,349 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 1-790364876:
Executing:  { Cmd , MgmtId: 132129494109518, via: 1(10.147.40.14), Ver: v1, Flags: 100011,
[{"com.cloud.agent.api.PingTestCommand":{"_computingHostIp":"10.147.40.31","wait":20}}] }
> 2013-12-23 18:19:46,350 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-353:ctx-a48feb80)
Seq 1-790364876: Executing request
> 2013-12-23 18:19:46,350 INFO  [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-353:ctx-a48feb80)
Executing resource PingTestCommand: {"_computingHostIp":"10.147.40.31","contextMap":{},"wait":20}
> 2013-12-23 18:19:46,351 ERROR [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-353:ctx-a48feb80)
Unable to execute ping command on DomR (null), domR may not be ready yet. failure due to There
was a problem while connecting to null:3922
> 2013-12-23 18:19:46,351 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-353:ctx-a48feb80)
Seq 1-790364876: Response Received:
> 2013-12-23 18:19:46,351 DEBUG [c.c.a.t.Request] (DirectAgent-353:ctx-a48feb80) Seq 1-790364876:
Processing:  { Ans: , MgmtId: 132129494109518, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"PingTestCommand
failed","wait":0}}] }
> 2013-12-23 18:19:46,351 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 1-790364876:
Received:  { Ans: , MgmtId: 132129494109518, via: 1, Ver: v1, Flags: 10, { Answer } }
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.AbstractInvestigatorImpl] (AgentTaskPool-16:ctx-be3804c7)
host (10.147.40.31) cannot be pinged, returning null ('I don't know')
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.UserVmDomRInvestigator] (AgentTaskPool-16:ctx-be3804c7)
could not reach agent, could not reach agent's host, returning that we don't have enough information
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7)
PingInvestigator unable to determine the state of the host.  Moving on.
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7)
ManagementIPSysVMInvestigator unable to determine the state of the host.  Moving on.
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7)
KVMInvestigator unable to determine the state of the host.  Moving on.
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7)
VMwareInvestigator unable to determine the state of the host.  Moving on.
> 2013-12-23 18:19:46,351 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-be3804c7)
Agent state cannot be determined, do nothing
> Attaching MS log and cloud DB.
> Agent 5 is the host which was powered off.
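
The tail of the log above suggests a chain of investigators that each answer "up", "down", or "I don't know", with the agent manager doing nothing when none of them gives a definite answer. The following is a minimal Python sketch of that decision pattern only; it is not CloudStack's code. All names (`Status`, `investigate`, `next_state`, `fallback_to_alert`) are hypothetical, and the `fallback_to_alert` switch merely models the expected result stated in this report, not an actual fix.

```python
from enum import Enum
from typing import Callable, Optional, Sequence


class Status(Enum):
    UP = "Up"
    DOWN = "Down"
    ALERT = "Alert"


# Hypothetical investigator signature: takes a host id and returns
# Status.UP, Status.DOWN, or None (meaning "I don't know").
Investigator = Callable[[int], Optional[Status]]


def investigate(host_id: int, investigators: Sequence[Investigator]) -> Optional[Status]:
    """Return the first definite answer, or None if every investigator is unsure."""
    for check in investigators:
        answer = check(host_id)
        if answer is not None:
            return answer
    return None


def next_state(current: Status, determined: Optional[Status],
               fallback_to_alert: bool) -> Status:
    """Pick the host state to record after an investigation.

    With fallback_to_alert=False an undetermined result leaves the host
    unchanged, which matches the behavior described in the report.
    With fallback_to_alert=True (an assumed policy) it moves to Alert.
    """
    if determined is not None:
        return determined
    return Status.ALERT if fallback_to_alert else current


def unsure(host_id: int) -> Optional[Status]:
    """Stand-in for an investigator that cannot reach the host."""
    return None


if __name__ == "__main__":
    result = investigate(5, [unsure, unsure, unsure, unsure])
    print(next_state(Status.UP, result, fallback_to_alert=False))
    print(next_state(Status.UP, result, fallback_to_alert=True))
```

With every investigator unsure, the first call leaves host 5 in Up, which is what the deployment planner then acts on; the second shows the state the report expects instead.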



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
