cloudstack-issues mailing list archives

From "Sangeetha Hariharan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CLOUDSTACK-3451) Parallel deployment - Xenserver - When deploying 30 Vms in parallel, some of the Vm deployment fails when “applying dhcp entry/applying userdata and password entry on router” and retry eventually happens when they succeed.
Date Wed, 10 Jul 2013 18:23:49 GMT

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sangeetha Hariharan updated CLOUDSTACK-3451:
--------------------------------------------

    Attachment: xenparallel.rar
    
> Parallel deployment - Xenserver - When deploying 30 Vms in parallel, some of the Vm deployment fails when “applying dhcp entry/applying userdata and password entry on router” and retry eventually happens when they succeed.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-3451
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3451
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.2.0
>         Environment: Build from 4.2
>            Reporter: Sangeetha Hariharan
>            Priority: Critical
>             Fix For: 4.2.0
>
>         Attachments: xenparallel.rar
>
>
> Steps to reproduce the problem:
> Set up an advanced zone with a XenServer host.
> Deploy 30 VMs in parallel.
> 9 of the VM deployments hit a failure while trying to “apply userdata and password entry on router” / “apply dhcp entry”. In all of these cases, the VM that is in the “Starting” state is stopped and an immediate second attempt to start it succeeds.
> Issues:
> 1. We should not see any failures in the jobs for “applying userdata and password entry on router” / “applying dhcp entry”.
> 2. Why is there logic to retry the VM deployment? In my case the setup has only 1 host, which seems to be put in the avoid set as part of the initial failure, yet the VM is then deployed successfully on this same host.
> Jobs that went through this scenario are:
> [root@asfmgmt management]# grep -i "stopcommand" management-server.log  | grep Executing
> 2013-07-09 16:30:05,707 DEBUG [agent.transport.Request] (Job-Executor-18:job-18) Seq 1-729350370: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-9-VM","wait":0}}] }
> 2013-07-09 16:30:12,547 DEBUG [agent.transport.Request] (Job-Executor-15:job-15) Seq 1-729350377: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-7-VM","wait":0}}] }
> 2013-07-09 16:30:20,990 DEBUG [agent.transport.Request] (Job-Executor-38:job-38) Seq 1-729350381: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-26-VM","wait":0}}] }
> 2013-07-09 16:30:23,529 DEBUG [agent.transport.Request] (Job-Executor-25:job-25) Seq 1-729350383: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-18-VM","wait":0}}] }
> 2013-07-09 16:30:32,231 DEBUG [agent.transport.Request] (Job-Executor-35:job-35) Seq 1-729350390: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}] }
> 2013-07-09 16:30:45,744 DEBUG [agent.transport.Request] (Job-Executor-17:job-17) Seq 1-729350407: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-8-VM","wait":0}}] }
> 2013-07-09 16:30:46,511 DEBUG [agent.transport.Request] (Job-Executor-36:job-36) Seq 1-729350408: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-25-VM","wait":0}}] }
> 2013-07-09 16:31:04,826 DEBUG [agent.transport.Request] (Job-Executor-35:job-35) Seq 1-729350422: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}] }
> 2013-07-09 16:31:17,707 DEBUG [agent.transport.Request] (Job-Executor-26:job-26) Seq 1-729350432: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-15-VM","wait":0}}] }
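To see which jobs and VMs the StopCommand entries listed above belong to, the grep output can be tallied per VM. The following is only a rough sketch: the names `SAMPLE_LOG`, `stop_commands` and `stops_per_vm` are made up for illustration, and it assumes each log entry sits on a single line, as it presumably does in management-server.log before mail wrapping. The three sample entries are copied from the output above.

```python
import re
from collections import Counter

# Three StopCommand entries copied from the grep output, one entry per line.
SAMPLE_LOG = """\
2013-07-09 16:30:05,707 DEBUG [agent.transport.Request] (Job-Executor-18:job-18) Seq 1-729350370: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-9-VM","wait":0}}] }
2013-07-09 16:30:32,231 DEBUG [agent.transport.Request] (Job-Executor-35:job-35) Seq 1-729350390: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}] }
2013-07-09 16:31:04,826 DEBUG [agent.transport.Request] (Job-Executor-35:job-35) Seq 1-729350422: Executing:  { Cmd , MgmtId: 7200344900649, via: 1, Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}] }
"""

# Captures the job id and the vmName of a StopCommand entry.
# Without re.DOTALL, "." never crosses a newline, so each match stays within one line.
ENTRY = re.compile(
    r'\(Job-Executor-\d+:(job-\d+)\).*?'
    r'"com\.cloud\.agent\.api\.StopCommand".*?"vmName":"([^"]+)"'
)


def stop_commands(log_text):
    """Return (job, vm_name) pairs for every StopCommand entry, in log order."""
    return ENTRY.findall(log_text)


def stops_per_vm(log_text):
    """Count how many StopCommand entries were issued for each VM."""
    return Counter(vm for _job, vm in stop_commands(log_text))


if __name__ == "__main__":
    for job, vm in stop_commands(SAMPLE_LOG):
        print(job, vm)
    print(stops_per_vm(SAMPLE_LOG))
```

In this sample, job-35 issues a StopCommand for i-3-27-VM twice, so counting per VM rather than per log line shows how many distinct VMs went through the stop-and-retry path.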
> Management server log snippet:
> 2013-07-09 16:30:22,214 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-38:job-38) Successfully cleanued up resources for the vm VM[User|hello-16] in Starting state
> 2013-07-09 16:30:22,220 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-38:job-38) Deploy avoids pods: null, clusters: null, hosts: [1]
> 2013-07-09 16:30:22,220 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-38:job-38) DataCenter id = '1' provided is in avoid set, DeploymentPlanner cannot allocate the VM, returning.
> 2013-07-09 16:30:22,236 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-38:job-38) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state transition: 1
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-38:job-38) Hosts's actual total CPU: 9044 and CPU after applying overprovisioning: 9044
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-38:job-38) Hosts's actual total RAM: 16190149248 and RAM after applying overprovisioning: 16190149632
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-38:job-38) release cpu from host: 1, old used: 4600,reserved: 0, actual total: 9044, total with overprovisioning: 9044; new used: 4500,reserved:0; movedfromreserved: false,moveToReserveredfalse
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-38:job-38) release mem from host: 1, old used: 9602859008,reserved: 0, total: 16190149632; new used: 9340715008,reserved:0; movedfromreserved: false,moveToReserveredfalse
> 2013-07-09 16:30:22,256 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-38:job-38) VM state transitted from :Stopped to Starting with event: StartRequestedvm's original host id: null new host id: null host id before state transition: null
> 2013-07-09 16:30:22,256 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-38:job-38) Successfully transitioned to start state for VM[User|hello-16] reservation id = 8b8d8303-0f78-4f03-9ee3-4cc3e129c746
> 2013-07-09 16:30:22,262 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-38:job-38) Trying to deploy VM, vm has dcId: 1 and podId: 1
> 2013-07-09 16:30:22,262 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-38:job-38) Deploy avoids pods: null, clusters: null, hosts: null
> 2013-07-09 16:30:22,269 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-38:job-38) Deploy avoids pods: null, clusters: null, hosts: null
> 2013-07-09 16:30:22,270 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-38:job-38) DeploymentPlanner allocation algorithm: com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_b2132c10@13aee390
> 2013-07-09 16:30:22,270 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-38:job-38) Trying to allocate a host and storage pools from dc:1, pod:1,cluster:null, requested cpu: 100, requested ram: 262144000
> 2013-07-09 16:30:22,273 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-38:job-38) Is ROOT volume READY (pool already allocated)?: No
> 2013-07-09 16:30:22,273 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-38:job-38) Searching resources only under specified Pod: 1
> 2013-07-09 16:30:22,273 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-38:job-38) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
> 2013-07-09 16:30:22,280 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-38:job-38) Checking resources in Cluster: 1 under Pod: 1
> 2013-07-09 16:30:22,280 DEBUG [allocator.impl.FirstFitAllocator] (Job-Executor-38:job-38 FirstFitRoutingAllocator) Looking for hosts in dc: 1 pod:1  cluster:1
> 2013-07-09 16:30:22,282 DEBUG [allocator.impl.FirstFitAllocator] (Job-Executor-38:job-38 FirstFitRoutingAllocator) FirstFitAllocator has 1 hosts to check for allocation: [Host[-1-Routing]]
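As a quick consistency check on the job-38 snippet above, the capacity released when the VM moved from Starting to Stopped can be compared with the "requested cpu" and "requested ram" values logged for the retry. This is only an illustrative sketch: all numbers are copied from the log lines, and the variable names are made up here.

```python
# Values copied from the job-38 CapacityManagerImpl "release cpu" / "release mem" lines.
cpu_old_used, cpu_new_used = 4600, 4500
mem_old_used, mem_new_used = 9602859008, 9340715008

# Values copied from the "Trying to allocate a host and storage pools" line.
requested_cpu, requested_ram = 100, 262144000

cpu_released = cpu_old_used - cpu_new_used
mem_released = mem_old_used - mem_new_used

# The amounts released after the failed start match what the retry requests again.
print("cpu released:", cpu_released, "matches request:", cpu_released == requested_cpu)
print("mem released:", mem_released, "matches request:", mem_released == requested_ram)
print("mem released in MiB:", mem_released // (1024 * 1024))
```

The figures line up: the failed attempt releases exactly one VM's worth of CPU and RAM on host 1 before the retry asks the planner for the same amounts.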

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira