cloudstack-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-9595) Transactions are not getting retried in case of database deadlock errors
Date Thu, 24 Nov 2016 18:25:58 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693970#comment-15693970
] 

ASF GitHub Bot commented on CLOUDSTACK-9595:
--------------------------------------------

Github user serg38 commented on the issue:

    https://github.com/apache/cloudstack/pull/1762
  
    Here it is few samples of deadlocks we observe in high transaction volume environment
with multiple management servers. As you can see most of them are concurrent operations from
different management servers and either select or select for update statements. The following
4 types account for the majority of deadlock s we saw so far ( 80-90% of all deadlocks). Deadlock
1-3 happens much more often than deadlock 4.  It is next to impossible to reproduce since
they occur one in few days with 4 management servers and average VM deployment volume of 3000
a day.
    
    Deadlock type 1:
    
    InnoDB: transactions deadlock detected, dumping detailed information.
    151217  3:08:20
    *** (1) TRANSACTION:
    TRANSACTION BB4D4C91D, ACTIVE 0 sec fetching rows
    mysql tables in use 1, locked 1
    LOCK WAIT 11 lock struct(s), heap size 3112, 5 row lock(s)
    MySQL thread id 47654, OS thread handle 0x7f0475bdd700, query id 3821358107 ussclpdcsmgt012.autodesk.com
10.41.13.14 cloud Sending data
    SELECT host.id, host.disconnected, host.name, host.status, host.type, host.private_ip_address,
host.private_mac_address, host.private_netmask, host.public_netmask, host.public_ip_address,
host.public_mac_address, host.storage_ip_address, host.cluster_id, host.storage_netmask, host.storage_mac_address,
host.storage_ip_address_2, host.storage_netmask_2, host.storage_mac_address_2, host.hypervisor_type,
host.proxy_port, host.resource, host.fs_type, host.available, host.setup, host.resource_state,
host.hypervisor_version, host.update_count, host.uuid, host.data_center_id, host.pod_id, host.cpu_sockets,
host.cpus, host.url, host.speed, host.ram, host.parent, host.guid, host.capabilities, host.total_size,
host.last_ping, host.mgmt_server_id, host.dom0_memory, host.version, host.created, host.removed
FROM host WHERE host.resource IS NOT NULL  AND host.mgmt_server_id = 345048964870 
    *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
    *** (2) TRANSACTION:
    TRANSACTION BB4D4C915, ACTIVE 1 sec fetching rows, thread declared inside InnoDB 449
    mysql tables in use 3, locked 3
    29 lock struct(s), heap size 6960, 15 row lock(s), undo log entries 1
    MySQL thread id 47623, OS thread handle 0x7f0a47074700, query id 3821724056 ussclpdcsmgt013.autodesk.com
10.41.13.15 cloud Copying to tmp table
    SELECT host.id, host.disconnected, host.name, host.status, host.type, host.private_ip_address,
host.private_mac_address, host.private_netmask, host.public_netmask, host.public_ip_address,
host.public_mac_address, host.storage_ip_address, host.cluster_id, host.storage_netmask, host.storage_mac_address,
host.storage_ip_address_2, host.storage_netmask_2, host.storage_mac_address_2, host.hypervisor_type,
host.proxy_port, host.resource, host.fs_type, host.available, host.setup, host.resource_state,
host.hypervisor_version, host.update_count, host.uuid, host.data_center_id, host.pod_id, host.cpu_sockets,
host.cpus, host.url, host.speed, host.ram, host.parent, host.guid, host.capabilities, host.total_size,
host.last_ping, host.mgmt_server_id, host.dom0_memory, host.version, host.created, host.removed
FROM host  LEFT OUTER JOIN op_host_transfer ON host.id=op_host_transfer.id
    *** (2) HOLDS THE LOCK(S):
    RECORD LOCKS space id 0 page no 147488 n bits 840 index `i_host__removed` of table `cloud`.`host`
trx id BB4D4C915 lock_mode X locks rec but not gap
    
    
    Deadlock 2:
    
    InnoDB: transactions deadlock detected, dumping detailed information.
    151218 11:03:00
    *** (1) TRANSACTION:
    TRANSACTION BBB232C81, ACTIVE 51 sec starting index read
    mysql tables in use 1, locked 1
    LOCK WAIT 3 lock struct(s), heap size 1248, 2 row lock(s)
    MySQL thread id 57308, OS thread handle 0x7f0a45c24700, query id 5217973695 ussclpdcsmgt013.autodesk.com
10.41.13.15 cloud Sending data
    SELECT resource_count.id, resource_count.type, resource_count.account_id, resource_count.domain_id,
resource_count.count FROM resource_count WHERE resource_count.id IN (5083,4867,5079,33652,5077)
 FOR UPDATE
    *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
    *** (2) TRANSACTION:
    TRANSACTION BBB2254AC, ACTIVE 116 sec starting index read, thread declared inside InnoDB
500
    mysql tables in use 1, locked 1
    207 lock struct(s), heap size 31160, 1650 row lock(s), undo log entries 2
    MySQL thread id 56926, OS thread handle 0x7f04756c9700, query id 5218549710 ussclpdcsmgt014.autodesk.com
10.41.13.16 cloud Sending data
    SELECT resource_count.id, resource_count.type, resource_count.account_id, resource_count.domain_id,
resource_count.count FROM resource_count WHERE resource_count.id IN (5083,4867,5079,33652,5077)
 FOR UPDATE
    
    Deadlock 3:
    
    ** (1) TRANSACTION:
    TRANSACTION BBB232C81, ACTIVE 51 sec starting index read
    mysql tables in use 1, locked 1
    LOCK WAIT 3 lock struct(s), heap size 1248, 2 row lock(s)
    MySQL thread id 57308, OS thread handle 0x7f0a45c24700, query id 5217973695 ussclpdcsmgt013.autodesk.com
10.41.13.15 cloud Sending data
    SELECT resource_count.id, resource_count.type, resource_count.account_id, resource_count.domain_id,
resource_count.count FROM resource_count WHERE resource_count.id IN (5083,4867,5079,33652,5077)
 FOR UPDATE
    *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
    *** (2) TRANSACTION:
    TRANSACTION BBB2254AC, ACTIVE 116 sec starting index read, thread declared inside InnoDB
500
    mysql tables in use 1, locked 1
    207 lock struct(s), heap size 31160, 1650 row lock(s), undo log entries 2
    MySQL thread id 56926, OS thread handle 0x7f04756c9700, query id 5218549710 ussclpdcsmgt014.autodesk.com
10.41.13.16 cloud Sending data
    SELECT resource_count.id, resource_count.type, resource_count.account_id, resource_count.domain_id,
resource_count.count FROM resource_count WHERE resource_count.id IN (5083,4867,5079,33652,5077)
 FOR UPDATE
    *** (2) HOLDS THE LOCK(S):
    
    Deadlock 4:
    *** (1) TRANSACTION:
    TRANSACTION C3BDD81EF, ACTIVE 0 sec fetching rows
    mysql tables in use 1, locked 1
    LOCK WAIT 55 lock struct(s), heap size 6960, 3 row lock(s), undo log entries 1
    MySQL thread id 250487, OS thread handle 0x7f0a460b6700, query id 32833273614 ussclpdcsmgt013.autodesk.com
10.41.13.15 cloud updating
    DELETE FROM vm_reservation WHERE vm_reservation.vm_id = 869089
    *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
    *** (2) TRANSACTION:
    TRANSACTION C3BDD81FC, ACTIVE 0 sec inserting, thread declared inside InnoDB 500
    mysql tables in use 1, locked 1
    4 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 2
    MySQL thread id 250553, OS thread handle 0x7f0a3720c700, query id 32833273762 ussclpdcsmgt013.autodesk.com
10.41.13.15 cloud update
    INSERT INTO volume_reservation (volume_reservation.vm_reservation_id, volume_reservation.vm_id,
volume_reservation.volume_id, volume_reservation.pool_id) VALUES (997419, 1009484, 918449,
316)
    *** (2) HOLDS THE LOCK(S):



> Transactions are not getting retried in case of database deadlock errors
> ------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9595
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>    Affects Versions: 4.8.0
>            Reporter: subhash yedugundla
>             Fix For: 4.8.1
>
>
> Customer is seeing occasional error 'Deadlock found when trying to get lock; try restarting
transaction' messages in their management server logs.  It happens regularly at least once
a day.  The following is the error seen 
> 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] (catalina-exec-3:ctx-f05c58fc ctx-39c17156
ctx-7becdf6e) unhandled exception executing api command: [Ljava.lang.String;@230a6e7f
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@74f134e3:
DELETE FROM instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374
> 	at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209)
> 	at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> 	at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
> 	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> 	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> 	at com.sun.proxy.$Proxy237.expunge(Unknown Source)
> 	at com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593)
> 	at com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
> 	at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57)
> 	at com.cloud.utils.db.Transaction.execute(Transaction.java:45)
> 	at com.cloud.utils.db.Transaction.execute(Transaction.java:54)
> 	at com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575)
> 	at com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message