cloudstack-issues mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-4911) [Mixed Hypervisor] VM Status is marked as alive when exit status of ping command is not available within command timeout
Date Mon, 21 Oct 2013 16:36:43 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800780#comment-13800780
] 

ASF subversion and git services commented on CLOUDSTACK-4911:
-------------------------------------------------------------

Commit b6a13d125773371813734e87bd39c6030707f97c in branch refs/heads/4.2 from [~sateeshc]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=b6a13d1 ]

CLOUDSTACK-4911 - [Mixed Hypervisor] VM Status is marked as alive when exit status of ping
command is not available within command timeout

Currently, when a remote command is executed over SSH and no response is received within
the timeout, CloudStack returns a success result.
This produces false positives. The fix is to check whether the exit status of the remote
command is available and, if it is not, to return a failure result.
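
The check described above can be sketched as follows. This is a minimal illustration in Python, not the code from the commit: a local subprocess stands in for the remote SSH command, and the names run_with_timeout and interpret_result are hypothetical. The point it shows is that a missing exit status is a third state and should map to failure, not success.

```python
import subprocess
import sys
from typing import Optional, Sequence, Tuple


def run_with_timeout(argv: Sequence[str], timeout_s: float) -> Optional[int]:
    """Run a command and return its exit status, or None if the status
    is not available within timeout_s seconds.

    Assumption: a local process is used here as a stand-in for a command
    executed over SSH.
    """
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The command did not finish in time, so no exit status exists.
        return None
    return completed.returncode


def interpret_result(exit_status: Optional[int]) -> Tuple[bool, str]:
    """Map an exit status to a (success, message) pair.

    A missing exit status is reported as failure, which is the behavior
    the fix describes.
    """
    if exit_status is None:
        return (False, "exit status not available within timeout")
    if exit_status != 0:
        return (False, f"command exited with status {exit_status}")
    return (True, "command succeeded")


if __name__ == "__main__":
    # A command that outlives the timeout is treated as a failure.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(interpret_result(run_with_timeout(slow, timeout_s=0.5)))
```

With this shape, a ping whose exit status never arrives cannot be read as "VM alive"; the caller sees a failure and can fall through to its other checks.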

Signed-off-by: Sateesh Chodapuneedi <sateesh@apache.org>


> [Mixed Hypervisor] VM Status is marked as alive when exit status of ping command is not available within command timeout
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-4911
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4911
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: VMware
>    Affects Versions: 4.2.0
>         Environment: Zone with a KVM cluster and VMware cluster
>            Reporter: Sateesh Chodapuneedi
>            Assignee: Sateesh Chodapuneedi
>             Fix For: 4.2.1
>
>
> Setup:
> 1) KVM cluster with two hosts, host1 and host2
> 2) VMware cluster with one host, host3
> 3) In the KVM cluster, create HA-enabled VM1. System VMs, including virtual router VR1, are running on host1 (Rack2host17)
> 4) In the VMware cluster, create HA-enabled VM2 on host3. VR2 and one guest VM are running on host3 (51.4)
> 5) Deploy an HA-enabled VM3 on host2 (Rack2Host18)
> Steps:
> 1) Create a KVM instance that connects to the VMware virtual router
>  Instance Name:v-cl-test-10658-003-M00000002
>  Network:PublicFrontSegment-VM
>  Virtual Router: r-13123-VM
> 2) Migrate the instance to the host (tckktky4-pbhpv081) that will be shut down
> 3) Shut down the host (tckktky4-pbhpv081)
>  17:27 tckktky4-pbhpv081 shutdown
> 4) Host down detected
> 2013-05-08 17:32:24,233 WARN [agent.manager.AgentAttache]
>  (StatsCollector-2:null) Seq 177-582680794: Timed out on null
> 2013-05-08 17:32:24,233 WARN [agent.manager.AgentManagerImpl]
>  (StatsCollector-2:null) Operation timed out: Commands 582680794 to Host 177 timed out after 3600
> ...
> 2013-05-08 17:32:28,552 DEBUG [cloud.ha.UserVmDomRInvestigator]
>  (HA-Worker-1:work-633) user vm v-cl-test-10658-003-M00000002 has been
>  successfully pinged, returning that it is alive
>  ★ Ping showed 100% loss, yet the log reports the instance as alive
> ...
> 2013-05-08 17:32:28,552 DEBUG [cloud.ha.HighAvailabilityManagerImpl]
>  (HA-Worker-1:work-633) Rescheduling because the host is not up but the vm is alive
> =====
> VM HA rescheduling was repeated 8 times: the first 7 attempts to start the VM failed, and on the 8th attempt the VM was restarted by HA on another KVM host.
> Root cause: the exit status of the ping command is not available within the command timeout of 20 seconds.



--
This message was sent by Atlassian JIRA
(v6.1#6144)