cloudstack-issues mailing list archives

From "Sangeetha Hariharan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-4651) Restarting management server when volume Snapshot is still in progress for root volume of a VM , then there is no way to restart VM since the startVM job is stuck forever since the volume is in "Snapshoting" state.
Date Tue, 17 Sep 2013 20:09:52 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769870#comment-13769870 ]

Sangeetha Hariharan commented on CLOUDSTACK-4651:
-------------------------------------------------

Tested with the latest build:

Deploy a VM.
Initiate a snapshot of this VM's root volume.
As soon as the volume snapshot command is issued, kill the management server process.
Start the management server.
Stop this VM.
Start this VM.

Starting the VM fails with the error message "Unable to create deployment, no usable volumes found for the VM".
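
The reproduction steps above map onto three API commands (createSnapshot, stopVirtualMachine, startVirtualMachine). Below is a minimal Python sketch that builds signed request query strings for them. It assumes the usual CloudStack request signing scheme (parameters sorted by name, lower-cased, HMAC-SHA1, Base64). The API key, secret key and root volume id are placeholders, and killing and restarting the management server between the first and second call is still a manual step.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_query(params, api_key, secret_key):
    """Build a signed query string for one API call (sketch, not an official client)."""
    full = dict(params, apikey=api_key, response="json")
    # Sort by parameter name and URL-encode every value.
    pairs = sorted(
        ((k, quote(str(v), safe="")) for k, v in full.items()),
        key=lambda kv: kv[0].lower(),
    )
    query = "&".join(f"{k}={v}" for k, v in pairs)
    # Sign the lower-cased query with HMAC-SHA1, then Base64- and URL-encode it.
    digest = hmac.new(secret_key.encode(), query.lower().encode(), hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return f"{query}&signature={signature}"


def repro_steps(vm_id, root_volume_id):
    """API calls for the scenario; the management server restart is not an API call."""
    return [
        {"command": "createSnapshot", "volumeid": root_volume_id},
        # Manual step here: kill the management server process, then start it again.
        {"command": "stopVirtualMachine", "id": vm_id},
        {"command": "startVirtualMachine", "id": vm_id},
    ]


if __name__ == "__main__":
    # The VM id is taken from the log in this comment; the volume id is a placeholder.
    for step in repro_steps("4a398643-5f9d-4c75-8174-a9bb62580538", "ROOT-VOLUME-UUID"):
        print(signed_query(step, "API-KEY", "SECRET-KEY"))
```

Each printed string would be appended to the management server API endpoint of the setup under test.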

With the CLOUDSTACK-4650 fixes, the volume remains in the "Snapshoting" state for only a very short time, so getting into this state is a timing issue.

The following exception is seen in the management server logs:

2013-09-17 12:50:31,176 DEBUG [cloud.api.ApiServlet] (catalina-exec-8:null) ===START===  10.215.3.9 -- GET  command=startVirtualMachine&id=4a398643-5f9d-4c75-8174-a9bb62580538&response=json&sessionkey=TkwA%2Bm9SCHW5H3aBJVqsWm4YB68%3D&_=1379448058217
2013-09-17 12:50:31,223 DEBUG [cloud.async.AsyncJobManagerImpl] (catalina-exec-8:null) submit async job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ], details: AsyncJobVO {id:53, userId: 3, accountId: 3, sessionKey: null, instanceType: VirtualMachine, instanceId: 10, cmd: org.apache.cloudstack.api.command.user.vm.StartVMCmd, cmdOriginator: null, cmdInfo: {"id":"4a398643-5f9d-4c75-8174-a9bb62580538","response":"json","sessionkey":"TkwA+m9SCHW5H3aBJVqsWm4YB68\u003d","cmdEventType":"VM.START","ctxUserId":"3","httpmethod":"GET","_":"1379448058217","ctxAccountId":"3","ctxStartEventId":"202"}, cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0, processStatus: 0, resultCode: 0, result: null, initMsid: 161197867246747, completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
2013-09-17 12:50:31,226 DEBUG [cloud.api.ApiServlet] (catalina-exec-8:null) ===END===  10.215.3.9 -- GET  command=startVirtualMachine&id=4a398643-5f9d-4c75-8174-a9bb62580538&response=json&sessionkey=TkwA%2Bm9SCHW5H3aBJVqsWm4YB68%3D&_=1379448058217
2013-09-17 12:50:31,229 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Executing org.apache.cloudstack.api.command.user.vm.StartVMCmd for job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]
2013-09-17 12:50:31,251 DEBUG [cloud.user.AccountManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Access to VM[User|sangee5] granted to Acct[2db522b2-192b-42bf-a9fd-f4a18fcbb51b-sangee] by DomainChecker_EnhancerByCloudStack_57441200
2013-09-17 12:50:31,266 DEBUG [cloud.network.NetworkModelImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Service SecurityGroup is not supported in the network id=208
2013-09-17 12:50:31,271 DEBUG [cloud.network.NetworkModelImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Service SecurityGroup is not supported in the network id=208
2013-09-17 12:50:31,315 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Deploy avoids pods: [], clusters: [], hosts: []
2013-09-17 12:50:31,318 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) DeploymentPlanner allocation algorithm: com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_4e6f8d51@4807e0d7
2013-09-17 12:50:31,319 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Trying to allocate a host and storage pools from dc:2, pod:2,cluster:null, requested cpu: 100, requested ram: 130023424
2013-09-17 12:50:31,319 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Is ROOT volume READY (pool already allocated)?: No
2013-09-17 12:50:31,319 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) This VM has last host_id specified, trying to choose the same host: 7
2013-09-17 12:50:31,402 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Checking if host: 7 has enough capacity for requested CPU: 100 and requested RAM: 130023424 , cpuOverprovisioningFactor: 1.0
2013-09-17 12:50:31,405 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Hosts's actual total CPU: 9044 and CPU after applying overprovisioning: 9044
2013-09-17 12:50:31,405 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) We need to allocate to the last host again, so checking if there is enough reserved capacity
2013-09-17 12:50:31,405 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Reserved CPU: 100 , Requested CPU: 100
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Reserved RAM: 130023424 , Requested RAM: 130023424
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Host has enough CPU and RAM available
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) STATS: Can alloc CPU from host: 7, used: 2300, reserved: 100, actual total: 9044, total with overprovisioning: 9044; requested cpu:100,alloc_from_last_host?:true ,considerReservedCapacity?: true
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) STATS: Can alloc MEM from host: 7, used: 2403336192, reserved: 130023424, total: 16190149632; requested mem: 130023424,alloc_from_last_host?:true ,considerReservedCapacity?: true
2013-09-17 12:50:31,406 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) The last host of this VM is UP and has enough capacity
2013-09-17 12:50:31,406 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Now checking for suitable pools under zone: 2, pod: 2, cluster: 2
2013-09-17 12:50:31,418 ERROR [cloud.async.AsyncJobManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Unexpected exception while executing org.apache.cloudstack.api.command.user.vm.StartVMCmd
com.cloud.utils.exception.CloudRuntimeException: Unable to create deployment, no usable volumes found for the VM
        at com.cloud.deploy.DeploymentPlanningManagerImpl.findSuitablePoolsForVolumes(DeploymentPlanningManagerImpl.java:1059)
        at com.cloud.deploy.DeploymentPlanningManagerImpl.planDeployment(DeploymentPlanningManagerImpl.java:358)
        at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.reserveVirtualMachine(VMEntityManagerImpl.java:187)
        at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.reserve(VirtualMachineEntityImpl.java:198)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3405)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:1948)
        at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
        at org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:120)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
        at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
2013-09-17 12:50:31,450 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Complete async job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ], jobStatus: 2, resultCode: 530, result: Error Code: 530 Error text: Unable to create deployment, no usable volumes found for the VM
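
The outcome of the start job can be pulled out of a log excerpt like the one above with a few lines of Python. This is only a sketch: the regular expression is written against the "Complete async job" line as it appears here, and the helper names unwrap and job_result are made up for illustration. It also rejoins records that a mail archive has wrapped across several lines. In the log above, jobStatus 2 with resultCode 530 accompanies the failed start.

```python
import re

# A new log record starts with a timestamp such as "2013-09-17 12:50:31,450 ".
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")

# Matches the job completion message as it appears in this log excerpt.
COMPLETE_RE = re.compile(
    r"Complete async job-(?P<job>\d+) = \[ (?P<uuid>[0-9a-f-]+)\s*\], "
    r"jobStatus: (?P<status>\d+), resultCode: (?P<code>\d+), result: (?P<result>.*)"
)


def unwrap(lines):
    """Join wrapped continuation lines onto the record that precedes them.

    Stack trace lines have no timestamp either, so they are also appended
    to the record that logged the exception.
    """
    records = []
    for line in lines:
        if TIMESTAMP_RE.match(line) or not records:
            records.append(line.rstrip())
        else:
            records[-1] += " " + line.strip()
    return records


def job_result(lines):
    """Return the first job completion found in the log, or None."""
    for record in unwrap(lines):
        m = COMPLETE_RE.search(record)
        if m:
            return {
                "job": int(m["job"]),
                "status": int(m["status"]),
                "code": int(m["code"]),
                "result": m["result"],
            }
    return None
```

Calling job_result(open(path).read().splitlines()) on a saved copy of the log (path being wherever the excerpt was stored) returns a dictionary with the job id, status, result code and error text.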

                
> Restarting management server when volume Snapshot is still in progress for root volume of a VM , then there is no way to restart VM since the startVM job is stuck forever since the volume is in "Snapshoting" state.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-4651
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4651
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.2.1
>         Environment: Build from 4.2-forward
>            Reporter: Sangeetha Hariharan
>            Assignee: Prachi Damle
>            Priority: Critical
>             Fix For: 4.2.1
>
>
> If the management server is restarted while a volume snapshot is still in progress for the root volume of a VM, the VM can no longer be started: the startVM job is stuck forever because the volume is in the "Snapshoting" state.
> Steps to reproduce the problem:
> Deploy a VM.
> Initiate a snapshot of this VM's root volume.
> While the volume snapshot is in progress, stop the management server.
> Start the management server.
> Stop this VM.
> Start this VM.
> The VM never transitions to the "Starting" state and remains in the "Stopped" state.
> The start VM async job never completes; it hits an infinite loop in this case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
