ambari-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-12178) Memory Exhausted During Upgrade Of Large Cluster
Date Sat, 27 Jun 2015 06:49:04 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604030#comment-14604030 ]

Hudson commented on AMBARI-12178:
---------------------------------

FAILURE: Integrated in Ambari-branch-2.1 #128 (See [https://builds.apache.org/job/Ambari-branch-2.1/128/])
AMBARI-12178 - Memory Exhausted During Upgrade Of Large Cluster (jonathanhurley) (jhurley: http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=77316937c92ba3465255bac5acd335317f58bdd7)
* ambari-server/src/main/java/org/apache/ambari/server/controller/internal/StageResourceProvider.java
* ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostEntity.java
* ambari-server/src/main/java/org/apache/ambari/server/topology/HostRequest.java
* ambari-server/src/main/java/org/apache/ambari/server/orm/entities/StageEntity.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleCommand.java
* ambari-server/src/main/java/org/apache/ambari/server/topology/TopologyManager.java
* ambari-server/src/main/java/org/apache/ambari/server/orm/dao/StageDAO.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeGroupResourceProvider.java


> Memory Exhausted During Upgrade Of Large Cluster
> ------------------------------------------------
>
>                 Key: AMBARI-12178
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12178
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.1.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Blocker
>             Fix For: 2.1.0
>
>
> During an upgrade of a large cluster, the memory used by Ambari grows until it is fully consumed. This only happens while the Upgrade Dialog page is open; if that popup is closed, memory usage stays relatively constant.
> The offending call is:
> {code}
> api/v1/clusters/perf400/upgrades/31?upgrade_groups/UpgradeGroup/status!=PENDING&fields=Upgrade/progress_percent,Upgrade/request_context,Upgrade/request_status,Upgrade/direction,upgrade_groups/UpgradeGroup,upgrade_groups/upgrade_items/UpgradeItem/status,upgrade_groups/upgrade_items/UpgradeItem/context,upgrade_groups/upgrade_items/UpgradeItem/group_id,upgrade_groups/upgrade_items/UpgradeItem/progress_percent,upgrade_groups/upgrade_items/UpgradeItem/request_id,upgrade_groups/upgrade_items/UpgradeItem/skippable,upgrade_groups/upgrade_items/UpgradeItem/stage_id,upgrade_groups/upgrade_items/UpgradeItem/status,upgrade_groups/upgrade_items/UpgradeItem/text&minimal_response=true
> {code}
> Based on heap dumps, the largest offenders are {{StageEntity}} and, as a result, {{byte[]}}:
> {noformat}
> Class Name| Objects |  Shallow Heap | Retained Heap
> ----------------------------------------------------
> byte[]    | 351,907 | 3,147,710,224 |              
> ----------------------------------------------------
> Class Name                                          | Objects | Shallow Heap | Retained Heap
> --------------------------------------------------------------------------------------------
> org.apache.ambari.server.orm.entities.StageEntity   | 192,356 |   18,466,176 | 3,075,693,136
> org.apache.ambari.server.orm.entities.StageEntity_  |       0 |            0 |
> org.apache.ambari.server.orm.entities.StageEntityPK |       0 |            0 |
> --------------------------------------------------------------------------------------------
> {noformat}
> Each {{StageEntity}} is holding about 30 KB:
> {noformat}
> Class Name | Shallow Heap | Retained Heap
> ----------------------------------------------------------------------------------------------------
> org.apache.ambari.server.orm.entities.StageEntity @ 0x738e03260 | 96 | 28,576
> |- <class> class org.apache.ambari.server.orm.entities.StageEntity @ 0x64058d268 | 8 | 8
> |- skippable java.lang.Integer @ 0x6401e9738  0 | 16 | 16
> |- clusterId java.lang.Long @ 0x64026c908  2 | 24 | 24
> |- requestId java.lang.Long @ 0x64026d840  31 | 24 | 24
> |- _persistence_primaryKey org.eclipse.persistence.internal.identitymaps.CacheId @ 0x642ce20e0 | 24 | 48
> |- _persistence_cacheKey org.eclipse.persistence.internal.identitymaps.HardCacheWeakIdentityMap$ReferenceCacheKey @ 0x6469cf328 | 104 | 136
> |- request org.apache.ambari.server.orm.entities.RequestEntity @ 0x728d046e8 | 112 | 432
> |- _persistence_listener org.eclipse.persistence.internal.descriptors.changetracking.AttributeChangeListener @ 0x72f073f20 | 32 | 32
> |- stageId java.lang.Long @ 0x7350c8b08  1199 | 24 | 24
> |- logInfo java.lang.String @ 0x7350c8b20  /tmp/ambari | 24 | 64
> |- requestContext java.lang.String @ 0x7350c8b38  Restarting DataNode on perf400-c-371.c.pramod-thangali.internal | 24 | 168
> |- hostRoleCommands org.eclipse.persistence.indirection.IndirectList @ 0x738a0ceb0 | 64 | 184
> |- roleSuccessCriterias org.eclipse.persistence.indirection.IndirectList @ 0x738a0cef0 | 64 | 184
> |- commandParamsStage byte[141] @ 0x738c46cc8  {"restart_type":"rolling_upgrade","upgrade_direction":"upgrade","version":"2.2.6.0-2799","target_stack":"HDP-2.2","original_stack":"HDP-2.2"} | 160 | 160
> |- hostParamsStage byte[776] @ 0x738dc16b0  {"ambari_db_rca_driver":"org.postgresql.Driver","ambari_db_rca_password":"mapred","ambari_db_rca_url":"jdbc:postgresql://perf400-a-1.c.pramod-thangali.internal/ambarirca","ambari_db_rca_username":"mapred","current_version":"2.2.0.0-2041","db_driver_filenam... | 792 | 792
> |- clusterHostInfo byte[26774] @ 0x739006378  {"nimbus_hosts":["278"],"all_racks":["/default-rack:0-405"],"ambari_server_host":["perf400-a-1.c.pramod-thangali.internal"],"app_timeline_server_hosts":["138"],"hive_mysql_host":["247"],"falcon_server_hosts":["2"],"hbase_master_hosts":["2"],"accumulo_maste... | 26,792 | 26,792
> ----------------------------------------------------------------------------------------------------
> {noformat}
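
As a quick sanity check on the figures quoted above, the arithmetic can be sketched in Java. The class and method names are hypothetical and exist only for illustration; the constants are copied from the two heap dump tables.

```java
/** Hypothetical helper; only the constants are taken from the heap dump tables above. */
final class HeapDumpMath {
    // Totals for StageEntity from the class histogram.
    static final long STAGE_ENTITY_COUNT = 192_356L;
    static final long STAGE_ENTITY_RETAINED_BYTES = 3_075_693_136L;

    // Figures from the single sampled StageEntity @ 0x738e03260.
    static final long SAMPLE_RETAINED_BYTES = 28_576L;
    static final long CLUSTER_HOST_INFO_BYTES = 26_792L;

    /** Average retained heap per object, rounded down to whole bytes. */
    static long averageRetained(long totalRetainedBytes, long objectCount) {
        return totalRetainedBytes / objectCount;
    }

    /** Share of a whole as a whole-number percentage, rounded down. */
    static long percentOf(long part, long whole) {
        return part * 100 / whole;
    }

    public static void main(String[] args) {
        System.out.println("average retained per StageEntity: "
                + averageRetained(STAGE_ENTITY_RETAINED_BYTES, STAGE_ENTITY_COUNT) + " bytes");
        System.out.println("clusterHostInfo share of the sample: "
                + percentOf(CLUSTER_HOST_INFO_BYTES, SAMPLE_RETAINED_BYTES) + "%");
    }
}
```

The average works out to 15,989 bytes per StageEntity, somewhat below the 28,576-byte sample, and in that sample clusterHostInfo alone accounts for about 93% of the retained size.
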
> It appears as though a local {{Cache}} in [ActionDBAccessorImpl|https://github.com/apache/ambari/blob/94c091e280a99e07db5f3910873e70aa3c18394f/ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessorImpl.java#L104] is holding on to these objects:
> {noformat:title=Shows the cache holding onto a HostEntity which holds onto a UnitOfWork map with lots of stale entities}
> Class Name | Ref. Objects | Shallow Heap | Ref. Shallow Heap | Retained Heap
> ----------------------------------------------------------------------------------------------------
> java.lang.Thread @ 0x641af65b8  ambari-action-scheduler Native Stack, Thread | 76 | 120 | 7,296 | 4,960,776
> |- <Java Local> org.apache.ambari.server.actionmanager.ActionDBAccessorImpl$$EnhancerByGuice$$dcf333e8 @ 0x640538f40 | 75 | 248 | 7,200 | 640,497,232
> |  '- hostRoleCommandCache com.google.common.cache.LocalCache$LocalManualCache @ 0x640474b58 | 75 | 16 | 7,200 | 640,496,984
> |     '- localCache com.google.common.cache.LocalCache @ 0x640da1650 | 75 | 128 | 7,200 | 640,496,968
> |        '- segments com.google.common.cache.LocalCache$Segment[4] @ 0x640f27e88 | 75 | 32 | 7,200 | 640,496,840
> |           |- [1] com.google.common.cache.LocalCache$Segment @ 0x6410ee3c8 | 22 | 80 | 2,112 | 151,456,800
> |           |  |- table java.util.concurrent.atomic.AtomicReferenceArray @ 0x6470826f8 | 21 | 16 | 2,016 | 2,080
> |           |  |  '- array java.lang.Object[512] @ 0x65dd9e088 | 21 | 2,064 | 2,016 | 2,064
> |           |  |     |- [346] com.google.common.cache.LocalCache$StrongAccessEntry @ 0x670caa3d0 | 1 | 48 | 96 | 2,854,000
> |           |  |     |  '- valueReference com.google.common.cache.LocalCache$StrongValueReference @ 0x670caa418 | 1 | 16 | 96 | 2,853,928
> |           |  |     |     '- referent org.apache.ambari.server.actionmanager.HostRoleCommand @ 0x670caa430 | 1 | 128 | 96 | 2,853,912
> |           |  |     |        '- hostEntity org.apache.ambari.server.orm.entities.HostEntity @ 0x66f876d18 | 1 | 136 | 96 | 2,827,496
> |           |  |     |           '- _persistence_listener org.eclipse.persistence.internal.descriptors.changetracking.AttributeChangeListener @ 0x66f89f530 | 1 | 32 | 96 | 32
> |           |  |     |              '- uow org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork @ 0x670ca0b30 | 1 | 360 | 96 | 2,826,496
> |           |  |     |                 '- identityMapAccessor org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor @ 0x66f7fbf38 | 1 | 24 | 96 | 2,825,688
> |           |  |     |                    '- identityMapManager org.eclipse.persistence.internal.identitymaps.IdentityMapManager @ 0x670c2b320 | 1 | 48 | 96 | 2,825,664
> |           |  |     |                       '- identityMaps java.util.HashMap @ 0x670c2b350 | 1 | 48 | 96 | 2,824,208
> |           |  |     |                          '- table java.util.HashMap$Node[32] @ 0x670cb1608 | 1 | 144 | 96 | 2,824,160
> |           |  |     |                             '- [5] java.util.HashMap$Node @ 0x670b71bd8 | 1 | 32 | 96 | 1,201,192
> |           |  |     |                                '- value org.eclipse.persistence.internal.identitymaps.UnitOfWorkIdentityMap @ 0x670c5a390 | 1 | 32 | 96 | 1,201,160
> |           |  |     |                                   '- cacheKeys java.util.HashMap @ 0x670c2b4d0 | 1 | 48 | 96 | 1,201,128
> |           |  |     |                                      '- table java.util.HashMap$Node[4096] @ 0x66f7c83c8 | 1 | 16,400 | 96 | 1,201,080
> |           |  |     |                                         '- [3271] java.util.HashMap$Node @ 0x670c772e8 | 1 | 32 | 96 | 200
> |           |  |     |                                            '- value org.eclipse.persistence.internal.identitymaps.CacheKey @ 0x66f756e30 | 1 | 96 | 96 | 96
> |           |  |     |                                               '- object org.apache.ambari.server.orm.entities.StageEntity @ 0x66f4f6f98 | 1 | 96 | 96 | 568
> ----------------------------------------------------------------------------------------------------
> {noformat}
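
The retention chain in the last dump (cache entry, then HostRoleCommand, then HostEntity, then the UnitOfWork and its stale StageEntity objects) is a general hazard of caching objects that reference managed JPA entities. One common way to avoid it, sketched here purely as an illustration, is to cache a value object that copies only the scalar fields it needs. This is not a description of the change committed for AMBARI-12178: every class and field name below is a hypothetical stand-in, and a plain ConcurrentHashMap stands in for the Guava cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a managed entity that keeps a large object graph reachable. */
final class ManagedHost {
    final long hostId;
    final String hostName;
    final Object persistenceContext; // stands in for the UnitOfWork reachable from the entity

    ManagedHost(long hostId, String hostName, Object persistenceContext) {
        this.hostId = hostId;
        this.hostName = hostName;
        this.persistenceContext = persistenceContext;
    }
}

/** Hypothetical cache value: copies scalars and keeps no reference to the entity. */
final class CachedCommand {
    final long taskId;
    final long hostId;
    final String hostName;

    private CachedCommand(long taskId, long hostId, String hostName) {
        this.taskId = taskId;
        this.hostId = hostId;
        this.hostName = hostName;
    }

    /** Copies the fields the cache needs; the entity itself is not retained. */
    static CachedCommand from(long taskId, ManagedHost host) {
        return new CachedCommand(taskId, host.hostId, host.hostName);
    }
}

/** Hypothetical cache wrapper; a ConcurrentHashMap stands in for the real cache. */
final class CommandCacheSketch {
    private final Map<Long, CachedCommand> cache = new ConcurrentHashMap<>();

    void put(long taskId, ManagedHost host) {
        cache.put(taskId, CachedCommand.from(taskId, host));
    }

    CachedCommand get(long taskId) {
        return cache.get(taskId);
    }

    public static void main(String[] args) {
        CommandCacheSketch sketch = new CommandCacheSketch();
        // The byte array plays the role of the large graph hanging off the entity.
        ManagedHost host = new ManagedHost(7L, "host-7.example", new byte[1024]);
        sketch.put(1199L, host);
        System.out.println(sketch.get(1199L).hostName);
    }
}
```

Because the cached value holds only a long and a String, the entity and anything reachable from it become collectable as soon as the caller drops its own reference.
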



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
