cloudstack-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-7749) AsyncJob GC thread cannot purge queue items that have been blocking for too long if exception is thrown in expunging some unfinished or completed old jobs, this will make some future jobs stuck.
Date Tue, 21 Oct 2014 14:57:33 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178478#comment-14178478
] 

ASF subversion and git services commented on CLOUDSTACK-7749:
-------------------------------------------------------------

Commit 248e4fbdacb1372da94cbcf49ca23a14c8c9b514 in cloudstack's branch refs/heads/baremetal-systemvm
from [~minchen07]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=248e4fb ]

CLOUDSTACK-7749: AsyncJob GC thread cannot purge queue items that have been blocking for too
long if exception is thrown in expunging some unfinished or completed old jobs, this will
make some future jobs stuck.


> AsyncJob GC thread cannot purge queue items that have been blocking for too long if exception
is thrown in expunging some unfinished or completed old jobs, this will make some future jobs
stuck.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-7749
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7749
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>            Reporter: Min Chen
>            Assignee: Min Chen
>            Priority: Critical
>             Fix For: 4.5.0
>
>
> AsyncJobManager has a GC thread to clean up some unfinished job and complete jobs that
are too old. In this same thread, we are also forcefully cancel blocking queue items if they've
been staying there for too long. Currently if there is an exception thrown in expunging one
job, for example, like the one below:
> 2014-10-14 17:57:26,347 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-67ae4177)
Expunging unfinished job AsyncJobVO
> {id:18443, userId: 60, accountId: 69, instanceType: null, instanc eId: null, cmd: com.cloud.vm.VmWorkStart,
cmdInfo: rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIACkoABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljb
HVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL
01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAABFAAAAAAAAADwAAAAAAAACdXQAGVZpcnR1YWxNYWNoaW5lTWFuY
WdlckltcGwAAAAAAAAAAHBwcHBwcHBzcgARamF2YS51dGlsLkhhc2hNYXAFB9rBwxZg0QMAAkYACmxvYWRGYWN0b3JJAAl0aHJlc2hvbGR4cD9AAAAAAAAMdwgAAAAQAAAAAXQAClZtUGFzc3dvcmR0ABxyTzBBQlhRQURuTmhkbVZrWDNCaGMzTjNiM0preHA,
cmdVersi on: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid:
244536014864905, completeMsid: null, lastUpdated: null, lastPolled: null, created: Wed Aug
20 18:14:13 EDT 2014}
> 2014-10-14 17:57:26,350 DEBUG [c.c.u.d.T.Transaction] (AsyncJobMgr-Heartbeat-1:ctx-67ae4177)
Rolling back the transaction: Time = 2 Name = AsyncJobMgr-Heartbeat-1; called by -TransactionLegacy.rollback:8
> 96-TransactionLegacy.removeUpTo:839-TransactionLegacy.close:663-TransactionContextInterceptor.invoke:35-ReflectiveMethodInvocation.proceed:161-ExposeInvocationInterceptor.invoke:91-ReflectiveMethodInvocat
> ion.proceed:172-JdkDynamicAopProxy.invoke:204-$Proxy151.expunge:-1-AsyncJobManagerImpl$8.doInTransactionWithoutResult:802-TransactionCallbackNoReturn.doInTransaction:25-Transaction$2.doInTransaction:49
> 2014-10-14 17:57:26,368 ERROR [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-67ae4177)
Unexpected exception when trying to execute queue item,
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@39fdfb52:
DELETE FROM async_job WHERE async_job.id= 18443
> at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1144)
> at sun.reflect.GeneratedMethodAccessor247.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:622)
> at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:33)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
> at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> at com.sun.proxy.$Proxy151.expunge(Unknown Source)
> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$8.doInTransactionWithoutResult(AsyncJobManagerImpl.java:802)
> at com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
> at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:49)
> at com.cloud.utils.db.Transaction.execute(Transaction.java:37)
> at com.cloud.utils.db.Transaction.execute(Transaction.java:46)
> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl.expungeAsyncJob(AsyncJobManagerImpl.java:799)
> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$7.reallyRun(AsyncJobManagerImpl.java:762)
> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$7.runInContext(AsyncJobManagerImpl.java:738)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:50)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:47)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:701)
> Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException:
Cannot delete or update a parent row: a foreign key constraint fails (`cloud`.`async_job_join_map`,
CONSTRAINT `fk_async_job_join_map__join_job_id` FOREIGN KEY (`join_job_id`) REFERENCES `async_job`
(`id`))
> all the following purge queue item action will not get chances to run at all. This will
cause potential job stuck issues since we are serializing VM operations per VM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message