ambari-dev mailing list archives

From "Jonathan Hurley" <jhur...@hortonworks.com>
Subject Review Request 36779: Ambari Cluster Deployment Stuck At 2% With A SQL Deadlock When Talking to SQL Azure
Date Fri, 24 Jul 2015 13:06:00 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36779/
-----------------------------------------------------------

Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sumit Mohanty.


Bugs: AMBARI-12526
    https://issues.apache.org/jira/browse/AMBARI-12526


Repository: ambari


Description
-------

When deploying a new cluster on SQL Azure, there is a recurring deadlock on the SQL Server.


Essentially, we have concurrent UPDATE statements in separate transactions acting on different
rows of hostcomponentstate. This seems to cause a deadlock because both processes hold an
X lock and then each tries to acquire a U lock. The U lock is what makes me think they are
trying to acquire a table lock in order to update the clustered index.

The solution was to:
- Ensure that some of the failing transactions were placed within the scope of our internal
Java locks
- Flush writes to the problem table (hostcomponentstate)
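
As a rough illustration of the two points above (not the code in this patch), the pattern can be
sketched as follows. The StateDao interface, the InMemoryDao class, and the updateState method
are hypothetical stand-ins invented for this sketch; in the server the DAO would sit on top of
JPA, and the in-memory maps here only imitate "pending" versus "flushed" writes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockedStateWriter {
  /** Hypothetical stand-in for a DAO; not the real HostComponentStateDAO API. */
  interface StateDao {
    void merge(String hostComponent, String state);
    void flush();
  }

  /** In-memory DAO, present only so that the sketch runs without a database. */
  static class InMemoryDao implements StateDao {
    final Map<String, String> pending = new ConcurrentHashMap<>();
    final Map<String, String> table = new ConcurrentHashMap<>();

    @Override
    public void merge(String hostComponent, String state) {
      pending.put(hostComponent, state);
    }

    @Override
    public void flush() {
      // Imitates pushing queued writes to the table immediately.
      table.putAll(pending);
      pending.clear();
    }
  }

  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private final StateDao dao;

  LockedStateWriter(StateDao dao) {
    this.dao = dao;
  }

  /** Performs the write and the flush while holding the Java write lock. */
  void updateState(String hostComponent, String state) {
    lock.writeLock().lock();
    try {
      dao.merge(hostComponent, state);
      dao.flush(); // write reaches the table before the lock is released
    } finally {
      lock.writeLock().unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    InMemoryDao dao = new InMemoryDao();
    LockedStateWriter writer = new LockedStateWriter(dao);

    // Several threads update different rows; the lock serializes the writes.
    Thread[] threads = new Thread[4];
    for (int i = 0; i < threads.length; i++) {
      final String row = "host" + i + "/DATANODE";
      threads[i] = new Thread(() -> writer.updateState(row, "INSTALLED"));
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println("rows written: " + dao.table.size());
  }
}
```

The idea is that only one thread at a time issues the UPDATE and flush against the problem
table, so two transactions should not end up waiting on each other's database locks.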


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListener.java c016cbd 
  ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostComponentStateDAO.java 00ffd5a 
  ambari-server/src/main/java/org/apache/ambari/server/state/Host.java 7a53c21 
  ambari-server/src/main/java/org/apache/ambari/server/state/Service.java 1137cba 
  ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponent.java 60a16eb 
  ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentHost.java 6917a15 
  ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java aa147de 
  ambari-server/src/main/java/org/apache/ambari/server/state/ServiceImpl.java 6484c9f 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 2b3bf05 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java 90fdbec 
  ambari-server/src/main/java/org/apache/ambari/server/state/configgroup/ConfigGroupImpl.java a01f4d4 
  ambari-server/src/main/java/org/apache/ambari/server/state/host/HostImpl.java e59f4aa 
  ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java b623479 
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ServiceComponentHostConcurrentWriteDeadlockTest.java PRE-CREATION 
PRE-CREATION 

Diff: https://reviews.apache.org/r/36779/diff/


Testing
-------

Deployed on SQL Azure about 50 times and did not see the deadlock occur. Before this change,
it would normally occur within the first 5 cluster deployments.


Thanks,

Jonathan Hurley

