cloudstack-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Koushik Das (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CLOUDSTACK-3137) systemvms HA fails on latest master (4591f94)
Date Wed, 03 Jul 2013 23:57:20 GMT

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Koushik Das updated CLOUDSTACK-3137:
------------------------------------

    Affects Version/s: 4.2.0
    
> systemvms HA fails on latest master (4591f94)
> ---------------------------------------------
>
>                 Key: CLOUDSTACK-3137
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3137
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>    Affects Versions: 4.2.0
>            Reporter: Prasanna Santhanam
>            Assignee: Koushik Das
>            Priority: Blocker
>             Fix For: 4.2.0
>
>         Attachments: logs.tar.bz2
>
>
> SystemVMs fail to start during HA on latest master (4591f94a0bb69875dca769c9738998d3b5b96106)
> It looks like the HA of system VMs is failing. On our automated test environment we reboot
the management server before the systemVMs spin up to ensure that the global settings are
affected and the configuration for the secstorage.internal.sites is rightly set in the secondary
storage VM. The management server upon reboot discovers that SSVM was put to perform HA in
the op_ha_work table and starts to re-spin the SSVM. At this time the CPVM is also booting
up. No further activity is observed after two HA attempts.
> SystemVM start before Management server start: 
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,738 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(secstorage-1:null) Allocating entries for VM: VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,771 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(secstorage-1:null) Allocating nics for VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,774 DEBUG [cloud.network.NetworkManagerImpl]
(secstorage-1:null) Allocating nic for vm VM[SecondaryStorageVm|s-1-VM] in network Ntwk[200|Public|1]
with requested profile NicProfile[0-0-null-null-null
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,843 DEBUG [cloud.network.NetworkManagerImpl]
(secstorage-1:null) Allocating nic for vm VM[SecondaryStorageVm|s-1-VM] in network Ntwk[202|Control|3]
with requested profile null
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,870 DEBUG [cloud.network.NetworkManagerImpl]
(secstorage-1:null) Allocating nic for vm VM[SecondaryStorageVm|s-1-VM] in network Ntwk[201|Management|2]
with requested profile null
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,916 DEBUG [cloud.network.NetworkManagerImpl]
(secstorage-1:null) Allocating nic for vm VM[SecondaryStorageVm|s-1-VM] in network Ntwk[203|Storage|4]
with requested profile null
> ...
> ...
> Jun 24 22:30:52 cloudstack-centos63 local0: 2013-06-25 05:30:52,552 DEBUG [agent.transport.Request]
(consoleproxy-1:null) Seq 3-2067660811: Sending  { Cmd , MgmtId: 200973787296321, via: 3,
Ver: v1, Flags: 100111, [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/1/1/","origUrl":"http://nfs/templates/acton/acton-systemvm-02062012.vhd.bz2","uuid":"1","id":1,"format":"VHD","accountId":1,"checksum":"f613f38c96bf039f2e5cbf92fa8ad4f8","hvm":false,"displayText":"SystemVM
Template (XenServer)","imageDataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://nfs.fmt.vmops.com:/expo...
> Jun 24 22:31:01 cloudstack-centos63 local0: 2013-06-25 05:31:01,824 INFO  [context.support.XmlWebApplicationContext]
(Thread-26:null) Closing Root WebApplicationContext: startup date [Mon Jun 24 23:25:50 UTC
2013]; root of context hierarchy
> ^^^^^^^^^^ Management Server Restarts ^^^^^^^^^^^^^^
> After restart SSVM scheduled for HA
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,394 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(Timer-1:null) Schedule vm for HA:  VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,455 INFO  [cloud.vm.VirtualMachineManagerImpl]
(Timer-1:null) Handling unfinished work item: ItWork[1805a89b-d03a-4f1f-b717-ac9f5da6edea-Starting-2-Prepare]
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,504 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(Timer-1:null) Schedule vm for HA:  VM[ConsoleProxy|v-2-VM]
> ...
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,540 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:null) Starting work
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,547 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-1:null) Starting work
> ...
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,815 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-2:work-1) Processing HAWork[1-HA-1-Starting-Investigating]
> ...
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,289 DEBUG [cloud.ha.CheckOnAgentInvestigator]
(HA-Worker-0:work-2) Unable to reach the agent for VM[ConsoleProxy|v-2-VM]: Resource [Host:3]
is unreachable: Host 3: Host with specified id is not in the right state: Disconnected
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,290 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:work-2) SimpleInvestigator found VM[ConsoleProxy|v-2-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,293 DEBUG [cloud.ha.CheckOnAgentInvestigator]
(HA-Worker-2:work-1) Unable to reach the agent for VM[SecondaryStorageVm|s-1-VM]: Resource
[Host:2] is unreachable: Host 2: Host with specified id is not in the right state: Disconnected
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,298 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-2:work-1) SimpleInvestigator found VM[SecondaryStorageVm|s-1-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,336 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:work-2) XenServerInvestigator found VM[ConsoleProxy|v-2-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,338 DEBUG [cloud.ha.UserVmDomRInvestigator]
(HA-Worker-0:work-2) Not a User Vm, unable to determine state of VM[ConsoleProxy|v-2-VM] returning
null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,338 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:work-2) null found VM[ConsoleProxy|v-2-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,338 DEBUG [cloud.ha.ManagementIPSystemVMInvestigator]
(HA-Worker-0:work-2) Testing if VM[ConsoleProxy|v-2-VM] is alive
> ....
> Rescheduled HA
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,623 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:work-2) Rescheduling HAWork[2-HA-2-Starting-Investigating] to try again at Tue
Jun 25 05:43:07 UTC 2013
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,626 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-2:work-1) Rescheduling HAWork[1-HA-1-Starting-Investigating] to try again at Tue
Jun 25 05:43:07 UTC 2013
> ...
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,586 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-3:work-1) Processing HAWork[1-HA-1-Starting-Investigating]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,592 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-3:work-1) HA on VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,593 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-3:work-1) VM VM[SecondaryStorageVm|s-1-VM] has been changed.  Current State = Starting
Previous State = Starting last updated = 6 previous updated = 2
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,593 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-3:work-1) Completed HAWork[1-HA-1-Starting-Investigating]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,611 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-2:work-2) Processing HAWork[2-HA-2-Starting-Investigating]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,618 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-2:work-2) HA on VM[ConsoleProxy|v-2-VM]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,618 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-2:work-2) VM VM[ConsoleProxy|v-2-VM] has been changed.  Current State = Starting
Previous State = Starting last updated = 6 previous updated = 2
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,618 INFO  [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-2:work-2) Completed HAWork[2-HA-2-Starting-Investigating]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message