ambari-dev mailing list archives

From "Jonathan Hurley"
Subject Re: Review Request 35945: Memory Exhausted During Upgrade Of Large Cluster
Date Sat, 27 Jun 2015 02:29:50 GMT

(Updated June 26, 2015, 10:29 p.m.)

Review request for Ambari, Mahadev Konar, Nate Cole, Sumit Mohanty, and Tom Beerbower.

Bugs: AMBARI-12178

Repository: ambari


During an upgrade of a large cluster, the memory used by Ambari grows until it is fully consumed.
This, however, only happens when the Upgrade Dialog page is open. If that popup is closed,
the memory usage stays relatively constant. Based on heap dumps, the largest offenders are
StageEntity and, as a result, byte[].

Long story short, we have a cache in ActionDBAccessorImpl that holds onto entities.
Because of this, the underlying UnitOfWork map is never released and holds onto all StageEntity
instances. Eventually, items are purged from this cache, but it's not fast enough to free
up memory resources.

Without ripping apart Ambari or making dangerous cache reference changes, the easiest solution
was to ensure that the fields causing the StageEntity to be large were lazy loaded since most
of the time these entities are just sitting around in the EntityManager.
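
To illustrate the lazy-loading idea in isolation, here is a minimal plain-Java sketch. It is not the code from this patch and does not use JPA; in a JPA entity the same effect is usually requested declaratively, for example with @Basic(fetch = FetchType.LAZY) on the large field. The names LazyFieldSketch, Lazy, and StageLikeEntity are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class LazyFieldSketch {

    /** Defers creation of a value until the first call to get(). */
    static final class Lazy<T> {
        private Supplier<T> loader;
        private T value;
        private boolean loaded;

        Lazy(Supplier<T> loader) {
            this.loader = loader;
        }

        synchronized T get() {
            if (!loaded) {
                value = loader.get();
                loaded = true;
                loader = null; // the loader is no longer needed once the value exists
            }
            return value;
        }

        synchronized boolean isLoaded() {
            return loaded;
        }
    }

    /** Hypothetical stand-in for an entity with one small and one large field. */
    static final class StageLikeEntity {
        private final long id;
        private final Lazy<byte[]> payload;

        StageLikeEntity(long id, Supplier<byte[]> payloadLoader) {
            this.id = id;
            this.payload = new Lazy<>(payloadLoader);
        }

        long getId() {
            return id;
        }

        byte[] getPayload() {
            return payload.get();
        }

        boolean isPayloadLoaded() {
            return payload.isLoaded();
        }
    }

    /** Counts how many entities currently hold a materialized payload. */
    static long countLoaded(List<StageLikeEntity> entities) {
        return entities.stream().filter(StageLikeEntity::isPayloadLoaded).count();
    }

    public static void main(String[] args) {
        // Simulate a cache that keeps many entities alive at once.
        List<StageLikeEntity> cached = new ArrayList<>();
        for (long i = 0; i < 1000; i++) {
            cached.add(new StageLikeEntity(i, () -> new byte[16 * 1024]));
        }
        System.out.println("payloads loaded while idle: " + countLoaded(cached));

        // Only the entity that is actually read pays for its large field.
        cached.get(0).getPayload();
        System.out.println("payloads loaded after one access: " + countLoaded(cached));
    }
}
```

The point of the sketch is that entities which are merely referenced by a cache carry only their small fields; the large byte[] is allocated for an entity only when something reads it.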


  ambari-server/src/main/java/org/apache/ambari/server/orm/dao/ b354841 
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/ 9f3f70c

  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/ c2b97d6

  ambari-server/src/main/java/org/apache/ambari/server/topology/ f63ba3f 
  ambari-server/src/main/java/org/apache/ambari/server/topology/ 31363b4


Testing (updated)

Performed a cluster upgrade and monitored the memory usage. 200,000 StageEntity instances used
to occupy 3.5GB of heap; now they take up about 150MB.

Tests run: 3099, Failures: 0, Errors: 0, Skipped: 28

[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 24:13 min
[INFO] Finished at: 2015-06-26T21:32:27-04:00
[INFO] Final Memory: 46M/1414M
[INFO] ------------------------------------------------------------------------


Jonathan Hurley
