Subject: Re: Review Request 35945: Memory Exhausted During Upgrade Of Large Cluster
From: "Jonathan Hurley"
To: "Nate Cole", "Tom Beerbower"
Cc: "Ambari", "Jonathan Hurley"
Reply-To: dev@ambari.apache.org
Date: Fri, 26 Jun 2015 21:57:14 -0000
Message-ID: <20150626215714.3113.25529@reviews.apache.org>
References: <20150626215704.3114.34055@reviews.apache.org>
In-Reply-To: <20150626215704.3114.34055@reviews.apache.org>
X-ReviewGroup: Ambari
X-ReviewRequest-URL: https://reviews.apache.org/r/35945/
X-ReviewRequest-Repository: ambari
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="===============2606715727569313343=="

--===============2606715727569313343==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35945/
-----------------------------------------------------------

(Updated June 26, 2015, 5:57 p.m.)


Review request for Ambari, Nate Cole and Tom Beerbower.


Bugs: AMBARI-12178
    https://issues.apache.org/jira/browse/AMBARI-12178


Repository: ambari


Description
-------

During an upgrade of a large cluster, the memory used by Ambari grows until it is fully consumed. This only happens while the Upgrade Dialog page is open; if that popup is closed, memory usage stays relatively constant. Based on heap dumps, the largest offenders are StageEntity and, as a result, byte[].

Long story short, we have a cache in ActionDBAccessorImpl that holds onto entities. Because of this, the underlying UnitOfWork map is never released and holds onto all StageEntity instances. Items are eventually purged from this cache, but not fast enough to free up memory.

Without ripping apart Ambari or making dangerous cache reference changes, the easiest solution was to ensure that the fields that make StageEntity large are lazy loaded, since most of the time these entities are just sitting around in the EntityManager.

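To illustrate why lazy loading helps here, the following is a minimal plain-Java sketch of the idea, not the code from this patch. The class name LazyStageSketch, the payload field, and the Supplier-based loader are all hypothetical stand-ins; the description above does not say which fields were changed or which mechanism was used. In JPA, a comparable effect is commonly requested by annotating a field with @Basic(fetch = FetchType.LAZY), which the persistence provider treats as a hint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/**
 * Hypothetical stand-in for an entity with one large field.
 * The large field is only materialized on first access, so an
 * instance that merely sits in a cache stays small.
 */
final class LazyStageSketch {
    private final long stageId;
    // Assumption: stands in for a database read of the large column.
    private final Supplier<byte[]> loader;
    // Stays null until a caller actually asks for the payload.
    private byte[] payload;

    LazyStageSketch(long stageId, Supplier<byte[]> loader) {
        this.stageId = stageId;
        this.loader = loader;
    }

    long getStageId() {
        return stageId;
    }

    boolean isPayloadLoaded() {
        return payload != null;
    }

    byte[] getPayload() {
        if (payload == null) {
            payload = loader.get(); // load on first use only
        }
        return payload;
    }

    public static void main(String[] args) {
        // Simulate a cache that holds many entities for a long time.
        List<LazyStageSketch> cached = new ArrayList<>();
        for (long id = 0; id < 1000; id++) {
            cached.add(new LazyStageSketch(id, () -> new byte[16 * 1024]));
        }

        long loadedBefore = cached.stream()
                .filter(LazyStageSketch::isPayloadLoaded)
                .count();
        System.out.println("payloads loaded while cached: " + loadedBefore);

        // Only the entity that is actually read pays for its payload.
        int size = cached.get(0).getPayload().length;
        long loadedAfter = cached.stream()
                .filter(LazyStageSketch::isPayloadLoaded)
                .count();
        System.out.println("payload size: " + size
                + ", payloads loaded after one read: " + loadedAfter);
    }
}
```

The sketch does not fix the cache retention itself; the instances are still held. It only shows that a retained instance costs little until its large field is read, which matches the trade-off described above.
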
Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/StageResourceProvider.java 664fae3
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeGroupResourceProvider.java eb34d63
  ambari-server/src/main/java/org/apache/ambari/server/orm/dao/StageDAO.java b354841
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostEntity.java 9f3f70c
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/StageEntity.java c2b97d6
  ambari-server/src/main/java/org/apache/ambari/server/topology/TopologyManager.java 31363b4

Diff: https://reviews.apache.org/r/35945/diff/


Testing
-------

Performed a cluster upgrade and monitored the memory usage. 200,000 StageEntity instances used to occupy 3.5GB of heap; now they take up about 150MB.


Thanks,

Jonathan Hurley

--===============2606715727569313343==--