ambari-dev mailing list archives

From "Jonathan Hurley" <jhur...@hortonworks.com>
Subject Re: Review Request 36779: Ambari Cluster Deployment Stuck At 2% With A SQL Deadlock When Talking to SQL Azure
Date Fri, 24 Jul 2015 17:31:40 GMT


> On July 24, 2015, 1:27 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java, line 530
> > <https://reviews.apache.org/r/36779/diff/1/?file=1020965#file1020965line530>
> >
> >     Was this what caused the deadlock?

One of them; the transaction needs to complete within the scope of the lock.
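
To illustrate the ordering, here is a minimal sketch, not the actual Ambari code. `LockScopedTransaction`, `FakeTransaction`, `updateState`, and `events` are hypothetical names invented for this example, and the fake transaction only records events instead of talking to a database. The point is that commit (or rollback) runs before the write lock is released, so another thread cannot take the lock and start its own UPDATE while this transaction is still open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; none of these names come from Ambari.
final class LockScopedTransaction {

    /** Stand-in for a real database transaction; it only records what happened. */
    static final class FakeTransaction {
        private final List<String> log;

        FakeTransaction(List<String> log) {
            this.log = log;
        }

        void begin()    { log.add("begin"); }
        void commit()   { log.add("commit"); }
        void rollback() { log.add("rollback"); }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> log = new ArrayList<>();

    /**
     * Runs the update so that the transaction is finished (committed or
     * rolled back) before the write lock is released.
     */
    void updateState(Runnable update) {
        lock.writeLock().lock();
        log.add("lock");
        try {
            FakeTransaction tx = new FakeTransaction(log);
            tx.begin();
            try {
                update.run();
                tx.commit();      // still inside the lock
            } catch (RuntimeException e) {
                tx.rollback();    // also inside the lock
                throw e;
            }
        } finally {
            log.add("unlock");
            lock.writeLock().unlock();
        }
    }

    /** Events recorded so far, in order. */
    List<String> events() {
        return log;
    }
}
```

The failure mode is the reverse ordering: the lock is released first and the commit happens afterwards, outside the lock.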


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36779/#review92932
-----------------------------------------------------------


On July 24, 2015, 9:05 a.m., Jonathan Hurley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36779/
> -----------------------------------------------------------
> 
> (Updated July 24, 2015, 9:05 a.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-12526
>     https://issues.apache.org/jira/browse/AMBARI-12526
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> When deploying a new cluster on SQL Azure, there is a recurring deadlock on the SQL Server.
> 
> Essentially, we have concurrent UPDATE statements in separate transactions acting on different rows of hostcomponentstate. This seems to cause a deadlock because both processes hold an X lock and then try to acquire a U lock. The U lock is what makes me think they are trying to acquire the table lock in order to update the clustered index.
> 
> The solution was to:
> - Ensure that some of the failing transactions were placed within the scope of our internal Java locks
> - Flush writes to the problem table
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListener.java c016cbd 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostComponentStateDAO.java 00ffd5a 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Host.java 7a53c21 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Service.java 1137cba 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponent.java 60a16eb 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentHost.java 6917a15 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java aa147de 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceImpl.java 6484c9f 
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 2b3bf05 
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java 90fdbec 
>   ambari-server/src/main/java/org/apache/ambari/server/state/configgroup/ConfigGroupImpl.java a01f4d4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/host/HostImpl.java e59f4aa 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java b623479 
>   ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ServiceComponentHostConcurrentWriteDeadlockTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/36779/diff/
> 
> 
> Testing
> -------
> 
> Deployed on SQL Azure about 50 times and did not see the deadlock occur. It would normally occur in the first 5 cluster deployments.
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>

