cloudstack-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
Date Thu, 24 Nov 2016 17:59:58 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693923#comment-15693923
] 

ASF GitHub Bot commented on CLOUDSTACK-9595:
--------------------------------------------

Github user rafaelweingartner commented on the issue:

    https://github.com/apache/cloudstack/pull/1762
  
    @serg38 I have just now started reading this PR (excuse me if I overlooked some information).
    
    > If we are to try to implement a general way of dealing with deadlocks in ACS how
could it be done to ensure DB consistency and correct transaction retry?
    
    Answering your question: in my opinion, we should not “try” to implement a general
way of managing transactions. We only have this type of problem because, instead of using
a framework to manage database access and transactions, a module was developed to do
that and incorporated into ACS; this means we have to maintain and live with this code.
    
    Now, the problem is that it would be a Dantesque task to change the way ACS manages transactions
today.
    
    I am with John on this one: retrying is not a good idea; it can hide problems, add overhead,
and cause even more headaches. I think that the best approach is to deal with this type of
problem as it arises; that is, as John said, addressing each one as a bug when it is reported.
    
    Having said that, I have not helped a bit to solve the problem… Let’s see if I can
be of any help. 
    
    I was reading the ticket #CLOUDSTACK-9595. It seems that the problem (reported there)
happened when a VM was being removed from the table “instance_group_vm_map”. I just do not
understand why, because the method called is “UserVmManagerImpl.addInstanceToGroup”. I am hoping
that this makes sense. Anyways…
    
    The MySQL docs have the following on deadlocks:
    > A deadlock is a situation where different transactions are unable to proceed because
each holds a lock that the other needs
    
    This means that something else was being executed when that VM was deleted/added, and
this caused the deadlock and the exception. Probably something else is using the table “instance_group_vm_map”.
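
    To make the quoted definition concrete, here is a minimal Java sketch. It is not ACS
code, and the class name, method name, and transaction labels (T1, T2) are hypothetical. It
assumes a simplified model in which each blocked transaction waits on exactly one other
transaction, and it reports a deadlock when those "waits for" links form a cycle.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: a toy "waits-for" model, not CloudStack or MySQL code.
public class WaitForGraphSketch {

    // waitsFor.get("T1") == "T2" means T1 is blocked on a lock that T2 holds.
    // Assumption: each transaction waits on at most one other transaction.
    public static boolean hasDeadlock(Map<String, String> waitsFor) {
        for (String start : waitsFor.keySet()) {
            Set<String> seen = new HashSet<>();
            String current = start;
            // Follow the chain of waits until it ends or revisits a transaction.
            while (current != null) {
                if (!seen.add(current)) {
                    return true; // revisited a transaction: the waits form a cycle
                }
                current = waitsFor.get(current);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> waitsFor = new HashMap<>();

        // T1 waits for T2, but T2 is not blocked: T2 can finish and release its lock.
        waitsFor.put("T1", "T2");
        System.out.println("T1 -> T2 only: deadlock = " + hasDeadlock(waitsFor));

        // Now T2 also waits for T1: neither can proceed.
        waitsFor.put("T2", "T1");
        System.out.println("T1 -> T2 -> T1: deadlock = " + hasDeadlock(waitsFor));
    }
}
```

    In this model, a single blocked transaction is just ordinary lock waiting; the deadlock
only appears once the second transaction waits back on the first. That is why finding the
other participant matters.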
    
    I think we should track down the two tasks/processes that cause the problem and fix
them, instead of looking for a generic way to deal with this situation. Maybe the processes
that are causing the deadlock are locking tables they do not need, or executing some processing
that could be avoided or modified.
    
    Do we have a use case that can reproduce the problem?


> Transactions are not getting retried in case of database deadlock errors
> ------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9595
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>    Affects Versions: 4.8.0
>            Reporter: subhash yedugundla
>             Fix For: 4.8.1
>
>
> Customer is seeing occasional error 'Deadlock found when trying to get lock; try restarting
transaction' messages in their management server logs.  It happens regularly at least once
a day.  The following is the error seen 
> 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] (catalina-exec-3:ctx-f05c58fc ctx-39c17156
ctx-7becdf6e) unhandled exception executing api command: [Ljava.lang.String;@230a6e7f
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@74f134e3:
DELETE FROM instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374
> 	at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209)
> 	at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> 	at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
> 	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> 	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> 	at com.sun.proxy.$Proxy237.expunge(Unknown Source)
> 	at com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593)
> 	at com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
> 	at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57)
> 	at com.cloud.utils.db.Transaction.execute(Transaction.java:45)
> 	at com.cloud.utils.db.Transaction.execute(Transaction.java:54)
> 	at com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575)
> 	at com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
