From: "ASF GitHub Bot (JIRA)"
To: cloudstack-issues@incubator.apache.org
Reply-To: dev@cloudstack.apache.org
Date: Thu, 24 Nov 2016 17:59:58 +0000 (UTC)
Subject: [jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693923#comment-15693923 ]

ASF GitHub Bot commented on CLOUDSTACK-9595:
--------------------------------------------

Github user rafaelweingartner commented on the issue:

    https://github.com/apache/cloudstack/pull/1762

    @serg38 I have just now started reading this PR (excuse me if I overlooked some information).

    > If we are to try to implement a general way of dealing with deadlocks in ACS how could it be done to ensure DB consistency and correct transaction retry?

    Answering your question: in my opinion, we should not “try” to implement a general way of managing transactions. We only have this type of problem because, instead of using a framework to manage database access and transactions, a custom module was developed for that and incorporated into ACS; this means we have to maintain and live with this code.

    Now, the problem is that it would be a Dantesque task to change the way ACS manages transactions today.

    I am with John on this one: retrying is not a good idea; it can hide problems, cause overhead, and cause even more headaches. I think that the best approach is to deal with this type of problem on the fly; this means, as John said, addressing them as bugs when they are reported.

    Having said that, I have not helped a bit to solve the problem… Let’s see if I can be of any help.

    I was reading the ticket #CLOUDSTACK-9595. It seems that the problem (reported there) happened when a VM was being removed from the table “instance_group_vm_map”. I just do not understand this, because the method being called is “UserVmManagerImpl.addInstanceToGroup”. I am hoping that this makes sense.
    Anyways…

    The MySQL docs have the following on deadlocks:
    > A deadlock is a situation where different transactions are unable to proceed because each holds a lock that the other needs

    This means that something else was being executed when that VM was deleted/added, and this caused the deadlock and the exception. Probably something else is using the table “instance_group_vm_map”.

    I think we should track down the two tasks/processes that can cause the problem and work them out, instead of looking for a generic way to deal with this situation. Maybe the processes that are causing the deadlock are locking tables that they do not need, or executing some processing that could be avoided or modified.

    Do we have a use case that can reproduce the problem?


> Transactions are not getting retried in case of database deadlock errors
> ------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9595
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.)
>    Affects Versions: 4.8.0
>            Reporter: subhash yedugundla
>             Fix For: 4.8.1
>
>
> Customer is seeing occasional error 'Deadlock found when trying to get lock; try restarting transaction' messages in their management server logs. It happens regularly at least once a day.
> The following is the error seen
> 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] (catalina-exec-3:ctx-f05c58fc ctx-39c17156 ctx-7becdf6e) unhandled exception executing api command: [Ljava.lang.String;@230a6e7f
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@74f134e3: DELETE FROM instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374
> 	at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209)
> 	at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> 	at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
> 	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> 	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> 	at com.sun.proxy.$Proxy237.expunge(Unknown Source)
> 	at com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593)
> 	at com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
> 	at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57)
> 	at com.cloud.utils.db.Transaction.execute(Transaction.java:45)
> 	at com.cloud.utils.db.Transaction.execute(Transaction.java:54)
> 	at com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575)
> 	at com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)