cloudstack-issues mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-2568) ACS41 regression in storage subsystem (seen with local storage and 2 or more hosts)
Date Wed, 22 May 2013 19:16:20 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664426#comment-13664426
] 

ASF subversion and git services commented on CLOUDSTACK-2568:
-------------------------------------------------------------

Commit 78186c3b0201ecf55779b3f4bb6a3105fec1288d in branch refs/heads/4.1 from [~prachidamle]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=78186c3 ]

CLOUDSTACK-2568: ACS41 regression in storage subsystem (seen with local storage and 2 or more
hosts)

Patch for 4.1, changes:
- In VolumeReservationVO, the getter method of a column had a typo, causing us to create a
wrong searchbuilder. It was searching over the 'id' column instead of 'vm_reservation_id'.
- This bug was causing the vm deployment to choose a wrong pool during deployment since the
search was matching on the wrong column
- This bug in the GenericSearchBuilder is also fixed - if the getter method does not use the
standard 'get' or 'is' prefix, one should annotate that method using
 @Column(name = "<column_name>") and indicate which column this method refers to. This
will cause the GenericSearchBuilder to identify the field correctly.
- Also, let planner search for pools instead of selecting the one reserved - because there
is no way currently to pass multiple pool information to the planner and this may cause issues
when a VM has multiple disks.
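
The getter-to-column rule described above can be sketched as follows. This is a minimal,
self-contained illustration and not the actual GenericSearchBuilder code: the Column
annotation, the VolumeReservation class, the columnNameFor helper and the snake_case
naming rule are all hypothetical stand-ins chosen for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class ColumnNameSketch {

    /** Hypothetical stand-in for a persistence-style @Column annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Column {
        String name();
    }

    /** Hypothetical entity, loosely modelled on a volume reservation row. */
    static class VolumeReservation {
        private long id;
        private long vmReservationId;
        private boolean removed;

        public long getId() { return id; }

        // Non-standard getter name: without the annotation the resolver
        // below cannot tell which column this method reads.
        @Column(name = "vm_reservation_id")
        public long vmReservation() { return vmReservationId; }

        public boolean isRemoved() { return removed; }
    }

    /** Resolves the column a getter refers to (assumed rule, for illustration only). */
    static String columnNameFor(Class<?> type, String getterName) {
        Method method;
        try {
            method = type.getMethod(getterName);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No public getter named " + getterName, e);
        }
        // An explicit annotation always wins over name-based guessing.
        Column column = method.getAnnotation(Column.class);
        if (column != null) {
            return column.name();
        }
        String name = method.getName();
        if (name.startsWith("get") && name.length() > 3) {
            return toSnakeCase(name.substring(3));
        }
        if (name.startsWith("is") && name.length() > 2) {
            return toSnakeCase(name.substring(2));
        }
        // Fail loudly instead of silently falling back to some other column.
        throw new IllegalStateException(
            "Cannot infer column for " + name + "; annotate it with @Column");
    }

    /** Converts e.g. "VmReservationId" to "vm_reservation_id". */
    private static String toSnakeCase(String property) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < property.length(); i++) {
            char c = property.charAt(i);
            if (Character.isUpperCase(c) && i > 0) {
                sb.append('_');
            }
            sb.append(Character.toLowerCase(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(columnNameFor(VolumeReservation.class, "getId"));
        System.out.println(columnNameFor(VolumeReservation.class, "vmReservation"));
        System.out.println(columnNameFor(VolumeReservation.class, "isRemoved"));
    }
}
```

In this sketch, a getter that follows neither prefix convention and carries no annotation
raises an error rather than resolving to an unrelated column, which is the kind of silent
mismatch the first bullet describes.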

Signed-off-by: Chip Childers <chip.childers@gmail.com>

                
> ACS41 regression in storage subsystem (seen with local storage and 2 or more hosts)
> -----------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-2568
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-2568
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.1.0
>         Environment: RHES64 as in OEL64. Install from RPM built from latest GIT on OEL64.
> 2 or more KVM hypervisors with local storage in one cluster that has one primary NFS
storage pool.
>            Reporter: Ove Ewerlid
>            Assignee: Prachi Damle
>            Priority: Blocker
>             Fix For: 4.1.0
>
>         Attachments: var-log-cloudstack-management.tar.gz
>
>
> ACS402 works with no issues when tested in exactly the setup where ACS41 fails.
> Identical configuration (the same setup program is used for testing both versions).
> In ACS410 startVM fails if and only if the advanceStart: log line picks a poolID that
is not valid.
> E.g., the poolID reported in this logline appears random across a large number of tests.
> If a poolID is picked that cannot be reached by the host selected for deployment, the startVM
fails.
> This is blocking upgrade from 4.0 to 4.1 since there is no reliable way to start VMs
that have been deployed. If a deployed VM fails to start, giving the startVM command multiple
times will eventually make the VM start.
> The more hosts there are, the less likely it is that a startVM will succeed, because it is
less likely that the poolID is correct.
> The log portion below shows that a "correct" Deployment Destination is reported for the VM
while advanceStart reports a different poolID; since the selected hypervisor
cannot reach that poolID, the startVM fails.
> The bug never triggers if there is only one KVM with local storage since the poolId can
not be wrong, there is just one (and the NFS pool is always valid).
> -------------------
> 2013-05-20 06:49:29,477 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Found a potential host id: 1 name: vm3-net0-s0-14.test.devops and associated storage pools
for this VM
> 2013-05-20 06:49:29,478 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Returning Deployment Destination: Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))]
: Dest[Zone(1)-Pod(1)-Cluster(1)-Host\
> (1)-Storage(Volume(10|ROOT-->Pool(200), Volume(11|DATADISK-->Pool(200))]
> 2013-05-20 06:49:29,495 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-34:job-34)
VM state transitted from :Stopped to Starting with event: StartRequestedvm's original host
id: null new host id: null host id before state trans\
> ition: null
> 2013-05-20 06:49:29,495 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-34:job-34)
Successfully transitioned to start state for VM[User|testvm-a] reservation id = e644d55e-3627-4395-9f89-639e6fc2f261
> 2013-05-20 06:49:29,502 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-34:job-34)
Trying to deploy VM, vm has dcId: 1 and podId: null
> 2013-05-20 06:49:29,502 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-34:job-34)
advanceStart: DeploymentPlan is provided, using dcId:1, podId: 1, clusterId: 1, hostId: 1,
poolId: 201
> 2013-05-20 06:49:29,502 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-34:job-34)
Deploy avoids pods: null, clusters: null, hosts: null
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
DeploymentPlanner allocation algorithm: random
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Trying to allocate a host and storage pools from dc:1, pod:1,cluster:1, requested cpu: 4000,
requested ram: 2147483648
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Is ROOT volume READY (pool already allocated)?: Yes
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
DeploymentPlan has host_id specified, making no checks on this host, looks like admin test:
1
> 2013-05-20 06:49:29,505 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Looking for suitable pools for this host under zone: 1, pod: 1, cluster: 1
> 2013-05-20 06:49:29,506 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Checking suitable pools for volume (Id, Type): (10,ROOT)
> 2013-05-20 06:49:29,506 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Volume has pool(201) already allocated, checking if pool can be reused, poolId: null
> 2013-05-20 06:49:29,506 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
finding pool by id '201'
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Planner need not allocate a pool for this volume since its READY
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Checking suitable pools for volume (Id, Type): (11,DATADISK)
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Volume has pool(201) already allocated, checking if pool can be reused, poolId: null
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
finding pool by id '201'
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Planner need not allocate a pool for this volume since its READY
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Trying to find a potenial host and associated storage pools from the suitable host/pool lists
for this VM
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Checking if host: 1 can access any suitable storage pool for volume: DATADISK
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Host: 1 cannot access pool: 201
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Could not find a potential host that has associated storage pools from the suitable host/pool
lists for this VM
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34)
Cannnot deploy to specified host, returning.
> 2013-05-20 06:49:29,524 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-34:job-34)
VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host
id: null new host id: null host id before state tran\
> sition: null
> 2013-05-20 06:49:29,533 ERROR [cloud.async.AsyncJobManagerImpl] (Job-Executor-34:job-34)
Unexpected exception while executing org.apache.cloudstack.api.command.user.vm.StartVMCmd
> com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment
for VM[User|testvm-a]Scope=interface com.cloud.dc.DataCenter; id=1
>         at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:728)
>         at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:471)
>         at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:212)
>         at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:209)
>         at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3865)
>         at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:2573)
>         at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>         at org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:120)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:162)
>         at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:437)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
