cloudstack-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rafael Weingärtner (JIRA) <j...@apache.org>
Subject [jira] [Comment Edited] (CLOUDSTACK-8949) Job timeout caused by improper/inconsistent application of wait parameters
Date Tue, 07 Mar 2017 16:59:38 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-8949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899768#comment-15899768
] 

Rafael Weingärtner edited comment on CLOUDSTACK-8949 at 3/7/17 4:59 PM:
------------------------------------------------------------------------

LOL, I was being cited here, and I was not even aware.
No, it is not a bug. It is more a documentation/standardization issue.

I will change the information on this and on the other (CLOUDSTACK-8946) to improvements on
code and docs.


was (Author: rafaelweingartner):
LOL, I was being cited here, and I was not even aware of.
No, it is not a bug. It is more a documentation/standardization issue.

I will change the information on this and on the other (CLOUDSTACK-8946) to improvements on
code and docs.

> Job timeout caused by improper/inconsistent application of wait parameters
> --------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-8949
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-8949
>             Project: CloudStack
>          Issue Type: Improvement
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Doc, Hypervisor Controller
>            Reporter: Ryan Farrington
>            Priority: Minor
>             Fix For: Future
>
>
> when creating a "MigrateVolumeCommand" a timeout should be set, this way the ACS would
not use the default value of parameter "wait".  That timeout value should be the same as the
one used on CitrixResourceBase and its children to control the migration of volumes.
> Environment:    XenServer 6.2, multiple pre-setup primary storage locations that are
iscsi SRs in XenServer.
> Summary of actions:      Attempt to move a volume that is attached to a running instance
to another primary storage location.
> What *I* assume should happen:    Cloudstack communicates with the host of the VM in
question and issues a xe vdi-pool-migrate.   The command is synchronous and doesn't return
control back until it has performed the action.   Cloudstack should know this is a synchronous
call and wait the configured amount of time for the migration to be performed.
> What *I* think is happening(with some education from Rafael Weingärtner):    There are
multiple timers being set in the process (please reference  
> https://issues.apache.org/jira/browse/CLOUDSTACK-8946 for the strangely documented and
implemented wait parameters) when the timer that should be used for all the timeouts is the
one that is currently defined as the migratewait.  
> Here is a full transaction of a 1TB volume migration for instance i-34-311.
> {noformat}
> 2015-10-12 16:41:20,456 DEBUG [c.c.a.t.Request] (Job-Executor-63:ctx-f7b6817d ctx-c6b92515)
Seq 38-996939857: Sending  { Cmd , MgmtId: 42756806312036, via: 38(xen-nc-bc2b7), Ver: v1,
Flags: 100111, [{"com.cloud.agent.api.storage.MigrateVolumeCommand":{"volumeId":808,"volumePath":"0cd3ec8c-9fa9-4caf-8380-1a85cdfd0958","pool":{"id":246,"uuid":"VNX_PR5_LUN2003","host":"localhost","path":"/VNX_PR5_LUN2003","port":0,"type":"PreSetup"},"attachedVmName":"i-34-311-VM","wait":0}}]
}
> 2015-10-12 16:41:20,456 DEBUG [c.c.a.t.Request] (Job-Executor-63:ctx-f7b6817d ctx-c6b92515)
Seq 38-996939857: Executing:  { Cmd , MgmtId: 42756806312036, via: 38(xen-nc-bc2b7), Ver:
v1, Flags: 100111, [{"com.cloud.agent.api.storage.MigrateVolumeCommand":{"volumeId":808,"volumePath":"0cd3ec8c-9fa9-4caf-8380-1a85cdfd0958","pool":{"id":246,"uuid":"VNX_PR5_LUN2003","host":"localhost","path":"/VNX_PR5_LUN2003","port":0,"type":"PreSetup"},"attachedVmName":"i-34-311-VM","wait":0}}]
}
> 2015-10-12 16:41:20,457 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-494:ctx-5aac3f29)
Seq 38-996939857: Executing request
> 2015-10-12 17:41:20,457 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-63:ctx-f7b6817d ctx-c6b92515)
Seq 38-996939857: Waiting some more time because this is the current command
> 2015-10-12 18:41:20,457 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-63:ctx-f7b6817d ctx-c6b92515)
Seq 38-996939857: Waiting some more time because this is the current command
> 2015-10-12 18:41:20,457 INFO  [c.c.u.e.CSExceptionErrorCode] (Job-Executor-63:ctx-f7b6817d
ctx-c6b92515) Could not find exception: com.cloud.exception.OperationTimedoutException in
error code list for exceptions
> 2015-10-12 18:41:20,465 WARN  [c.c.a.m.AgentAttache] (Job-Executor-63:ctx-f7b6817d ctx-c6b92515)
Seq 38-996939857: Timed out on Seq 38-996939857:  { Cmd , MgmtId: 42756806312036, via: 38(xen-nc-bc2b7),
Ver: v1, Flags: 100111, [{"com.cloud.agent.api.storage.MigrateVolumeCommand":{"volumeId":808,"volumePath":"0cd3ec8c-9fa9-4caf-8380-1a85cdfd0958","pool":{"id":246,"uuid":"VNX_PR5_LUN2003","host":"localhost","path":"/VNX_PR5_LUN2003","port":0,"type":"PreSetup"},"attachedVmName":"i-34-311-VM","wait":0}}]
}
> 2015-10-12 18:41:20,465 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-63:ctx-f7b6817d ctx-c6b92515)
Seq 38-996939857: Cancelling.
> 2015-10-12 18:41:20,465 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-63:ctx-f7b6817d ctx-c6b92515)
Seq 38-996939857: No more commands found
> 2015-10-12 18:41:20,465 DEBUG [o.a.c.s.RemoteHostEndPoint] (Job-Executor-63:ctx-f7b6817d
ctx-c6b92515) Failed to send command, due to Agent:38, com.cloud.exception.OperationTimedoutException:
Commands 996939857 to Host 38 timed out after 7200
> 2015-10-12 18:41:20,471 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-63:ctx-f7b6817d
ctx-c6b92515) copy failed
> com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:38,
com.cloud.exception.OperationTimedoutException: Commands 996939857 to Host 38 timed out after
7200
>         at org.apache.cloudstack.storage.RemoteHostEndPoint.sendMessage(RemoteHostEndPoint.java:116)
>         at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.migrateVolumeToPool(AncientDataMotionStrategy.java:382)
>         at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyAsync(AncientDataMotionStrategy.java:421)
>         at org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:70)
>         at org.apache.cloudstack.storage.volume.VolumeServiceImpl.migrateVolume(VolumeServiceImpl.java:931)
>         at com.cloud.storage.VolumeApiServiceImpl.liveMigrateVolume(VolumeApiServiceImpl.java:1680)
>         at com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:1666)
>         at com.cloud.storage.VolumeApiServiceImpl.migrateVolume(VolumeApiServiceImpl.java:1622)
>         at sun.reflect.GeneratedMethodAccessor335.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:622)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at com.sun.proxy.$Proxy196.migrateVolume(Unknown Source)
>         at org.apache.cloudstack.api.command.user.volume.MigrateVolumeCmd.execute(MigrateVolumeCmd.java:103)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
>         at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
>         at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:701)
> 2015-10-12 18:41:20,479 WARN  [o.a.c.s.d.ObjectInDataStoreManagerImpl] (Job-Executor-63:ctx-f7b6817d
ctx-c6b92515) Unsupported data object (VOLUME, org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@4fa7a45f),
no need to delete from object in store ref table
> 2015-10-12 18:41:20,479 DEBUG [c.c.s.VolumeApiServiceImpl] (Job-Executor-63:ctx-f7b6817d
ctx-c6b92515) migrate volume failed:com.cloud.utils.exception.CloudRuntimeException: Failed
to send command, due to Agent:38, com.cloud.exception.OperationTimedoutException: Commands
996939857 to Host 38 timed out after 7200
> 2015-10-12 18:41:20,480 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Job-Executor-63:ctx-f7b6817d)
Complete async job-5257, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed
to migrate volume"}
> 2015-10-12 18:41:20,486 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Job-Executor-63:ctx-f7b6817d)
Done executing org.apache.cloudstack.api.command.user.volume.MigrateVolumeCmd for job-5257
> 2015-10-12 18:41:20,489 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (Job-Executor-63:ctx-f7b6817d)
Remove job-5257 from job monitoring
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message