cloudstack-users mailing list archives

From Jaime Orlando Rojas Sanchez <jaime.ro...@kumo.com.co>
Subject RE: VM stuck in a failing Host
Date Thu, 20 Aug 2015 15:37:24 GMT
Below are the logs from when I click 'run' in ACS, after making the following changes in the DB:


- Change the state to 'stopped'
- Change host ID to a working host
- Change last host ID to a working host
- Check VR is up and running on a working host
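
The first three steps above can be sketched as a single UPDATE. This is only a minimal illustration, run against an in-memory SQLite table rather than the real CloudStack database. The table name `vm_instance`, the column names `state`, `host_id` and `last_host_id`, the 'Stopped' spelling, and the host IDs 101 and 102 are all assumptions made for the example; none of them are taken from this message.

```python
import sqlite3


def reassign_stopped_vm(conn, vm_id, working_host_id):
    """Mark a VM as stopped and point both host columns at a working host.

    Table and column names are assumed for illustration only.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE vm_instance "
            "SET state = 'Stopped', host_id = ?, last_host_id = ? "
            "WHERE id = ?",
            (working_host_id, working_host_id, vm_id),
        )
    return cur.rowcount  # number of rows changed


def demo():
    # Stand-in table holding only the columns this sketch touches.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vm_instance ("
        "id INTEGER PRIMARY KEY, state TEXT, "
        "host_id INTEGER, last_host_id INTEGER)"
    )
    # Hypothetical starting point: VM 14584 recorded as running on host 101.
    conn.execute("INSERT INTO vm_instance VALUES (14584, 'Running', 101, 101)")
    reassign_stopped_vm(conn, 14584, 102)
    return conn.execute(
        "SELECT state, host_id, last_host_id FROM vm_instance WHERE id = 14584"
    ).fetchone()


if __name__ == "__main__":
    print(demo())
```

Keeping all three columns in one statement mirrors what we saw with the VRs: changing only the state was not enough.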


-bash-4.1# tail -f management-server.log | grep 14584
2015-08-20 05:45:27,513 DEBUG [cloud.async.AsyncJobManagerImpl] (catalina-exec-21:null) submit
async job-45973 = [ d3ed77b7-534e-4a3b-9038-a66359162087 ], details: AsyncJobVO {id:45973,
userId: 2, accountId: 2, sessionKey: null, instanceType: VirtualMachine, instanceId: 14584,
cmd: org.apache.cloudstack.api.command.user.vm.StopVMCmd, cmdOriginator: null, cmdInfo: {"response":"json","id":"98227dc9-682e-4f42-87e1-bd4b8045c7c9","sessionkey":"hwnxmM0He9EXs2craugKg3XyWL4\u003d","cmdEventType":"VM.STOP","ctxUserId":"2","httpmethod":"GET","_":"1440067422009","ctxAccountId":"2","ctxStartEventId":"38650949","forced":"true"},
cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0, processStatus: 0, resultCode:
0, result: null, initMsid: 139549854171544, completeMsid: null, lastUpdated: null, lastPolled:
null, created: null}
2015-08-20 05:46:33,404 DEBUG [agent.transport.Request] (Job-Executor-31:job-45974 = [ 9996fc72-fb83-4e5d-94c5-886396dac536
]) Seq 792-1073807425: Sending  { Cmd , MgmtId: 139549854171544, via: 792, Ver: v1, Flags:
100111, [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/574/57764/28586b35-cb45-4565-bd9b-7aa46a2898da","volume":{"uuid":"a15d0923-0a25-408f-9d10-fd5d47b3fef9","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"3PAR_3000GB_ADV_SATA1","id":211,"poolType":"PreSetup","host":"localhost","path":"/3PAR_3000GB_ADV_SATA1","port":0}},"name":"ROOT-14584","size":107374182400,"path":"c7a8eebc-7750-455c-804f-64c0d66cb4f4","volumeId":57764,"vmName":"i-574-14584-VM","accountId":574,"format":"VHD","id":57764,"hypervisorType":"XenServer"},"dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://172.16.4.65/vol/secondary_clpr","_role":"Image"}},"vmName":"i-574-14584-VM","name":"srvrasautos2_ROOT-14584_20141007233517","hypervisorType":"XenServer","id":11831}},"destTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/574/599","uuid":"70d21214-33d0-49e0-8b45-c7702b0fe579","id":599,"format":"RAW","accountId":574,"hvm":true,"displayText":"templateras","imageDataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://172.16.4.65/vol/secondary_clpr","_role":"Image"}},"name":"248e2097b-4af7-38f7-a851-029ef11f52cc","hypervisorType":"XenServer"}},"executeInSequence":true,"wait":10800}}]
}
2015-08-20 05:49:03,212 DEBUG [cloud.async.AsyncJobManagerImpl] (catalina-exec-1:null) submit
async job-45975 = [ 8ac23585-989d-4e3d-bcb9-3d3602842b8f ], details: AsyncJobVO {id:45975,
userId: 2, accountId: 2, sessionKey: null, instanceType: VirtualMachine, instanceId: 14584,
cmd: org.apache.cloudstack.api.command.user.vm.StartVMCmd, cmdOriginator: null, cmdInfo: {"response":"json","id":"98227dc9-682e-4f42-87e1-bd4b8045c7c9","sessionkey":"hwnxmM0He9EXs2craugKg3XyWL4\u003d","cmdEventType":"VM.START","ctxUserId":"2","httpmethod":"GET","_":"1440067641673","ctxAccountId":"2","ctxStartEventId":"38651246"},
cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0, processStatus: 0, resultCode:
0, result: null, initMsid: 139549854171544, completeMsid: null, lastUpdated: null, lastPolled:
null, created: null}
2015-08-20 05:49:05,821 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-32:job-45975
= [ 8ac23585-989d-4e3d-bcb9-3d3602842b8f ]) Asking VirtualRouter to prepare for Nic[194602-14584-5d82a92d-b828-45bc-882a-b5ce17401812-172.16.100.244]
2015-08-20 05:49:08,947 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-32:job-45975
= [ 8ac23585-989d-4e3d-bcb9-3d3602842b8f ]) Asking VirtualRouter to prepare for Nic[194621-14584-null-172.16.180.35]
2015-08-20 05:49:10,181 DEBUG [cloud.storage.VolumeManagerImpl] (Job-Executor-32:job-45975
= [ 8ac23585-989d-4e3d-bcb9-3d3602842b8f ]) No need to recreate the volume: Vol[13270|vm=14584|DATADISK],
since it already has a pool assigned: 208, adding disk to VM
2015-08-20 05:49:10,184 DEBUG [cloud.storage.VolumeManagerImpl] (Job-Executor-32:job-45975
= [ 8ac23585-989d-4e3d-bcb9-3d3602842b8f ]) No need to recreate the volume: Vol[57764|vm=14584|ROOT],
since it already has a pool assigned: 211, adding disk to VM
2015-08-20 05:49:10,271 DEBUG [agent.transport.Request] (Job-Executor-32:job-45975 = [ 8ac23585-989d-4e3d-bcb9-3d3602842b8f
]) Seq 595-838074847: Sending  { Cmd , MgmtId: 139549854171544, via: 595, Ver: v1, Flags:
100111, [{"com.cloud.agent.api.StartCommand":{"vm":{"id":14584,"name":"i-574-14584-VM","bootloader":"PyGrub","type":"User","cpus":2,"minSpeed":525,"maxSpeed":2100,"minRam":4294967296,"maxRam":4294967296,"arch":"x86_64","os":"CentOS
6.0 (64-bit)","bootArgs":"","rebootOnCrash":false,"enableHA":true,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"63729fa8c6c9ecae","params":{"memoryOvercommitRatio":"1","platform":"viridian:true;acpi:true;apic:true;pae:true;nx:false","Message.ReservedCapacityFreed.Flag":"true","hypervisortoolsversion":"xenserver56","cpuOvercommitRatio":"4"},"uuid":"98227dc9-682e-4f42-87e1-bd4b8045c7c9","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"71095933-5ce2-4786-8527-1dbe11876004","volumeType":"DATADISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"3PAR_2000GB_ADV_SAS","id":208,"poolType":"PreSetup","host":"localhost","path":"/3PAR_2000GB_ADV_SAS","port":0}},"name":"Datos","size":214748364800,"path":"69451be5-bd65-41d4-b465-08933d393498","volumeId":13270,"vmName":"i-574-14584-VM","accountId":574,"format":"VHD","id":13270,"hypervisorType":"XenServer"}},"diskSeq":1,"type":"DATADISK"},{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"a15d0923-0a25-408f-9d10-fd5d47b3fef9","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"3PAR_3000GB_ADV_SATA1","id":211,"poolType":"PreSetup","host":"localhost","path":"/3PAR_3000GB_ADV_SATA1","port":0}},"name":"ROOT-14584","size":107374182400,"path":"c7a8eebc-7750-455c-804f-64c0d66cb4f4","volumeId":57764,"vmName":"i-574-14584-VM","accountId":574,"format":"VHD","id":57764,"hypervisorType":"XenServer"}},"diskSeq":0,"type":"ROOT"},{"data":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"id":0,"format":"ISO","accountId":0,"hvm":false}},"diskSeq":3,"type":"ISO"}],"nics":[{"deviceId":0,"networkRateMbps":200,"defaultNic":true,"uuid":"040aaa12-4338-4181-bb60-9dc07aa804e8","ip":"172.16.100.244","netmask":"255.255.252.0","gateway":"172.16.100.1","mac":"02:00:3b:b1:00:16","dns1":"66.165.160.179","dns2":"66.165.160.180","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://3042","isolationUri":"vlan://3042
","isSecurityGroupEnabled":false,"name":"VLAN3000-3010"},{"deviceId":1,"networkRateMbps":200,"defaultNic":false,"uuid":"3ef8c0d3-83c0-4569-990a-577ecd21f707","ip":"172.16.180.35","netmask":"255.255.255.240","gateway":"172.16.180.33","mac":"06:07:8c:00:0f:0d","dns1":"66.165.160.179","dns2":"66.165.160.180","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://182","isolationUri":"vlan://182","isSecurityGroupEnabled":false,"name":"VLAN3000-3010"}]},"hostIp":"172.16.1.11","executeInSequence":true,"wait":0}}]
}
2015-08-20 05:49:10,273 DEBUG [agent.transport.Request] (Job-Executor-32:job-45975 = [ 8ac23585-989d-4e3d-bcb9-3d3602842b8f
]) Seq 595-838074847: Executing:  { Cmd , MgmtId: 139549854171544, via: 595, Ver: v1, Flags:
100111, [{"com.cloud.agent.api.StartCommand":{"vm":{"id":14584,"name":"i-574-14584-VM","bootloader":"PyGrub","type":"User","cpus":2,"minSpeed":525,"maxSpeed":2100,"minRam":4294967296,"maxRam":4294967296,"arch":"x86_64","os":"CentOS
6.0 (64-bit)","bootArgs":"","rebootOnCrash":false,"enableHA":true,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"63729fa8c6c9ecae","params":{"memoryOvercommitRatio":"1","platform":"viridian:true;acpi:true;apic:true;pae:true;nx:false","Message.ReservedCapacityFreed.Flag":"true","hypervisortoolsversion":"xenserver56","cpuOvercommitRatio":"4"},"uuid":"98227dc9-682e-4f42-87e1-bd4b8045c7c9","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"71095933-5ce2-4786-8527-1dbe11876004","volumeType":"DATADISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"3PAR_2000GB_ADV_SAS","id":208,"poolType":"PreSetup","host":"localhost","path":"/3PAR_2000GB_ADV_SAS","port":0}},"name":"Datos","size":214748364800,"path":"69451be5-bd65-41d4-b465-08933d393498","volumeId":13270,"vmName":"i-574-14584-VM","accountId":574,"format":"VHD","id":13270,"hypervisorType":"XenServer"}},"diskSeq":1,"type":"DATADISK"},{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"a15d0923-0a25-408f-9d10-fd5d47b3fef9","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"3PAR_3000GB_ADV_SATA1","id":211,"poolType":"PreSetup","host":"localhost","path":"/3PAR_3000GB_ADV_SATA1","port":0}},"name":"ROOT-14584","size":107374182400,"path":"c7a8eebc-7750-455c-804f-64c0d66cb4f4","volumeId":57764,"vmName":"i-574-14584-VM","accountId":574,"format":"VHD","id":57764,"hypervisorType":"XenServer"}},"diskSeq":0,"type":"ROOT"},{"data":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"id":0,"format":"ISO","accountId":0,"hvm":false}},"diskSeq":3,"type":"ISO"}],"nics":[{"deviceId":0,"networkRateMbps":200,"defaultNic":true,"uuid":"040aaa12-4338-4181-bb60-9dc07aa804e8","ip":"172.16.100.244","netmask":"255.255.252.0","gateway":"172.16.100.1","mac":"02:00:3b:b1:00:16","dns1":"66.165.160.179","dns2":"66.165.160.180","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://3042","isolationUri":"vlan://3042
","isSecurityGroupEnabled":false,"name":"VLAN3000-3010"},{"deviceId":1,"networkRateMbps":200,"defaultNic":false,"uuid":"3ef8c0d3-83c0-4569-990a-577ecd21f707","ip":"172.16.180.35","netmask":"255.255.255.240","gateway":"172.16.180.33","mac":"06:07:8c:00:0f:0d","dns1":"66.165.160.179","dns2":"66.165.160.180","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://182","isolationUri":"vlan://182","isSecurityGroupEnabled":false,"name":"VLAN3000-3010"}]},"hostIp":"172.16.1.11","executeInSequence":true,"wait":0}}]
}
2015-08-20 05:49:10,402 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-434:null) VM
i-574-14584-VM is runing on host 53109eef-2f53-4f0e-a763-68817d573bd9
2015-08-20 05:49:10,403 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-434:null) The
VM is in stopped state, detected problem during startup : i-574-14584-VM
2015-08-20 05:49:10,404 DEBUG [agent.transport.Request] (DirectAgent-434:null) Seq 595-838074847:
Processing:  { Ans: , MgmtId: 139549854171544, via: 595, Ver: v1, Flags: 110, [{"com.cloud.agent.api.StartAnswer":{"vm":{"id":14584,"name":"i-574-14584-VM","bootloader":"PyGrub","type":"User","cpus":2,"minSpeed":525,"maxSpeed":2100,"minRam":4294967296,"maxRam":4294967296,"arch":"x86_64","os":"CentOS
6.0 (64-bit)","bootArgs":"","rebootOnCrash":false,"enableHA":true,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"63729fa8c6c9ecae","params":{"memoryOvercommitRatio":"1","platform":"viridian:true;acpi:true;apic:true;pae:true;nx:false","Message.ReservedCapacityFreed.Flag":"true","hypervisortoolsversion":"xenserver56","cpuOvercommitRatio":"4"},"uuid":"98227dc9-682e-4f42-87e1-bd4b8045c7c9","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"71095933-5ce2-4786-8527-1dbe11876004","volumeType":"DATADISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"3PAR_2000GB_ADV_SAS","id":208,"poolType":"PreSetup","host":"localhost","path":"/3PAR_2000GB_ADV_SAS","port":0}},"name":"Datos","size":214748364800,"path":"69451be5-bd65-41d4-b465-08933d393498","volumeId":13270,"vmName":"i-574-14584-VM","accountId":574,"format":"VHD","id":13270,"hypervisorType":"XenServer"}},"diskSeq":1,"type":"DATADISK"},{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"a15d0923-0a25-408f-9d10-fd5d47b3fef9","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"3PAR_3000GB_ADV_SATA1","id":211,"poolType":"PreSetup","host":"localhost","path":"/3PAR_3000GB_ADV_SATA1","port":0}},"name":"ROOT-14584","size":107374182400,"path":"c7a8eebc-7750-455c-804f-64c0d66cb4f4","volumeId":57764,"vmName":"i-574-14584-VM","accountId":574,"format":"VHD","id":57764,"hypervisorType":"XenServer"}},"diskSeq":0,"type":"ROOT"},{"data":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"id":0,"format":"ISO","accountId":0,"hvm":false}},"diskSeq":3,"type":"ISO"}],"nics":[{"deviceId":0,"networkRateMbps":200,"defaultNic":true,"uuid":"040aaa12-4338-4181-bb60-9dc07aa804e8","ip":"172.16.100.244","netmask":"255.255.252.0","gateway":"172.16.100.1","mac":"02:00:3b:b1:00:16","dns1":"66.165.160.179","dns2":"66.165.160.180","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://3042","isolationUri":"vlan://3042
","isSecurityGroupEnabled":false,"name":"VLAN3000-3010"},{"deviceId":1,"networkRateMbps":200,"defaultNic":false,"uuid":"3ef8c0d3-83c0-4569-990a-577ecd21f707","ip":"172.16.180.35","netmask":"255.255.255.240","gateway":"172.16.180.33","mac":"06:07:8c:00:0f:0d","dns1":"66.165.160.179","dns2":"66.165.160.180","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://182","isolationUri":"vlan://182","isSecurityGroupEnabled":false,"name":"VLAN3000-3010"}]},"host_guid":"53109eef-2f53-4f0e-a763-68817d573bd9","result":true,"details":"VM
i-574-14584-VM is runing on host 53109eef-2f53-4f0e-a763-68817d573bd9","wait":0}}] }
2015-08-20 05:49:19,614 WARN  [xen.resource.CitrixResourceBase] (DirectAgent-41:null) Detecting
a new state but couldn't find a old state so adding it to the changes: i-574-14584-VM
2015-08-20 05:49:19,615 DEBUG [agent.transport.Request] (DirectAgent-41:null) Seq 566-1609629709:
Processing:  { Ans: , MgmtId: 139549854171544, via: 566, Ver: v1, Flags: 10, [{"com.cloud.agent.api.ClusterSyncAnswer":{"_clusterId":5,"_newStates":{"i-574-14584-VM":{"t":"53109eef-2f53-4f0e-a763-68817d573bd9","u":"Running","v":"viridian:true;acpi:true;apic:true;pae:true;nx:false"}},"_isExecuted":false,"result":true,"wait":0}}]
}
2015-08-20 05:49:19,627 DEBUG [cloud.vm.VirtualMachineManagerImpl] (DirectAgent-41:null) VM
i-574-14584-VM: cs state = Running and realState = Running
2015-08-20 05:49:19,627 DEBUG [cloud.vm.VirtualMachineManagerImpl] (DirectAgent-41:null) VM
i-574-14584-VM: cs state = Running and realState = Running
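
The lines that matter in the log above are easier to see once extracted: the host GUID that the hypervisor reports for the VM, and the state the management server records afterwards. The following is a minimal sketch for pulling those out of a pasted log. The function name and the regular expressions are illustrative and not part of CloudStack, and it assumes the messages keep the wording shown above (including the "runing" spelling).

```python
import re

# Patterns follow the wording of the log lines quoted in this message.
HOST_RE = re.compile(r"VM (\S+) is runing on host ([0-9a-f-]{36})")
STATE_RE = re.compile(r"VM (\S+): cs state = (\w+) and realState = (\w+)")


def summarize(log_text):
    """Return (hosts, states) for each VM name found in the log text."""
    # The mail client wrapped long lines, so collapse all whitespace first.
    flat = re.sub(r"\s+", " ", log_text)
    hosts = {}
    for name, guid in HOST_RE.findall(flat):
        hosts.setdefault(name, set()).add(guid)
    # The last state report for each VM wins.
    states = {name: (cs, real) for name, cs, real in STATE_RE.findall(flat)}
    return {name: sorted(guids) for name, guids in hosts.items()}, states


SAMPLE = """\
2015-08-20 05:49:10,402 DEBUG [xen.resource.CitrixResourceBase] (DirectAgent-434:null) VM
i-574-14584-VM is runing on host 53109eef-2f53-4f0e-a763-68817d573bd9
2015-08-20 05:49:19,627 DEBUG [cloud.vm.VirtualMachineManagerImpl] (DirectAgent-41:null) VM
i-574-14584-VM: cs state = Running and realState = Running
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Comparing the extracted GUID with the GUIDs of the hosts in the pool shows which host the VM is being reported on.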

Regards,

Jaime O. Rojas S.
Technology Manager
jaime.rojas@kumo.com.co
Mobile: +57 301-3382382
Office: +57-1-8766767 x215

From: Jaime Orlando Rojas Sanchez
Sent: Thursday, August 20, 2015 9:54 AM
To: 'users@cloudstack.apache.org'
Subject: VM stuck in a failing Host

Hello,

We run ACS 4.2.1 on XenServer 6.2.0, with a zone that has a pool of 3 hosts. Yesterday one host
crashed and its OS was corrupted. I think we have lost that host and will have to reinstall it,
but the issue is that a couple of VMs and VRs were running on it. The failed host was the master
of the pool, so when it failed the whole pool was disconnected. We reassigned the master role and
recovered pool management from XenCenter and ACS. After that, one VM moved to a remaining host,
but all the VRs and one VM stayed stuck on the failed host.

In the DB the VRs and the VM still show as running, even though the host is marked as down and
in maintenance. We changed the VR state to 'stopped' and changed the "last host ID" and "host ID"
to a working host. After that we were able to destroy the VRs and recreate them successfully,
and they came up on a working host. If we changed only the state, the VRs could not be destroyed.
That workaround resolved about 70% of the outage, but one VM remains stuck on the failed host. We
changed its state and last host ID, but when we press start it "runs" on the failed host and
appears as running even though it is not. Any suggestions for forcing the VM to start on a
different host and removing it from the failed host? This is a critical VM, and we hope somebody
can give us a hand.

Regards,

Jaime O. Rojas S.
Technology Manager
jaime.rojas@kumo.com.co
Mobile: +57 301-3382382
Office: +57-1-8766767 x215

