ambari-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Hurley" <jhur...@hortonworks.com>
Subject Re: Review Request 33158: Ambari Server Deadlock When Mapping Hosts
Date Tue, 14 Apr 2015 15:01:30 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33158/
-----------------------------------------------------------

(Updated April 14, 2015, 11:01 a.m.)


Review request for Ambari, Myroslav Papirkovskyy, Nate Cole, Sid Wagle, and Tom Beerbower.


Bugs: AMBARI-10456
    https://issues.apache.org/jira/browse/AMBARI-10456


Repository: ambari


Description
-------

When mapping hosts concurrently with reading information from a cluster, there was a deadlock
between the building the cluster health report and mapping the new hosts. 

A few changes to note here:

- ClustersImpl uses concurrent maps; there's really no need to keep the internal lock. I removed
it in several places where the cluster is guaranteed to be available (such as when using the
ID to retrieve the cluster). The concurrent maps guard against concurrent modifications.

- The Ambari Event Publish was actually synchronous. This not only caused bottlenecks, but
also contributed to a secondary deadlock detected while fixing the original issue. It was
changed into a single-threaded asynchronous bus. Consumers of this bus should never rely on
it to perform its actions in order to perform their own logic, so changing the behavior seemed
correct


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/events/publishers/AmbariEventPublisher.java
96e66a62 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java 9cf1f5a

  ambari-server/src/main/java/org/apache/ambari/server/state/services/AmbariServerAlertService.java
89f9656 
  ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java fe4ba60

  ambari-server/src/test/java/org/apache/ambari/server/events/EventsTest.java 09c335a 
  ambari-server/src/test/java/org/apache/ambari/server/orm/dao/AlertDispatchDAOTest.java 8768ffc

  ambari-server/src/test/java/org/apache/ambari/server/orm/dao/AlertsDAOTest.java 9c8ea7d

  ambari-server/src/test/java/org/apache/ambari/server/state/alerts/AlertEventPublisherTest.java
19b5d46 
  ambari-server/src/test/java/org/apache/ambari/server/state/alerts/AlertStateChangedEventTest.java
b64afed 
  ambari-server/src/test/java/org/apache/ambari/server/state/alerts/InitialAlertEventTest.java
4e55c49 
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/AlertDataManagerTest.java
acf7911 
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterDeadlockTest.java
766105d 
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClustersDeadlockTest.java
839b25f 
  ambari-server/src/test/java/org/apache/ambari/server/utils/EventBusSynchronizer.java PRE-CREATION


Diff: https://reviews.apache.org/r/33158/diff/


Testing
-------

New tests written to cover the deadlocks; existing tests updated for the async event bus.


Thanks,

Jonathan Hurley


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message