ambari-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-12178) Memory Exhausted During Upgrade Of Large Cluster
Date Sun, 28 Jun 2015 01:28:04 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604458#comment-14604458 ]

Hudson commented on AMBARI-12178:
---------------------------------

FAILURE: Integrated in Ambari-trunk-Commit #3021 (See [https://builds.apache.org/job/Ambari-trunk-Commit/3021/])
AMBARI-12178 - Memory Exhausted During Upgrade Of Large Cluster (part2) (jonathanhurley) (jhurley: http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=b079040d89e92a14dfd1ba51cda22b34f661f1e6)
* ambari-server/src/main/java/org/apache/ambari/server/orm/dao/StageDAO.java


> Memory Exhausted During Upgrade Of Large Cluster
> ------------------------------------------------
>
>                 Key: AMBARI-12178
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12178
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.1.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Blocker
>             Fix For: 2.1.0
>
>
> During an upgrade of a large cluster, the memory used by Ambari grows until the heap is
> exhausted. This only happens while the Upgrade Dialog popup is open. If that popup
> is closed, the memory usage stays relatively constant.
> The offending call is:
> {code}
> api/v1/clusters/perf400/upgrades/31?upgrade_groups/UpgradeGroup/status!=PENDING&fields=Upgrade/progress_percent,Upgrade/request_context,Upgrade/request_status,Upgrade/direction,upgrade_groups/UpgradeGroup,upgrade_groups/upgrade_items/UpgradeItem/status,upgrade_groups/upgrade_items/UpgradeItem/context,upgrade_groups/upgrade_items/UpgradeItem/group_id,upgrade_groups/upgrade_items/UpgradeItem/progress_percent,upgrade_groups/upgrade_items/UpgradeItem/request_id,upgrade_groups/upgrade_items/UpgradeItem/skippable,upgrade_groups/upgrade_items/UpgradeItem/stage_id,upgrade_groups/upgrade_items/UpgradeItem/status,upgrade_groups/upgrade_items/UpgradeItem/text&minimal_response=true
> {code}
> Based on heap dumps, the largest offenders are {{StageEntity}} and, as a result, {{byte[]}}:
> {noformat}
> Class Name | Objects |  Shallow Heap | Retained Heap
> ----------------------------------------------------
> byte[]     | 351,907 | 3,147,710,224 |
> ----------------------------------------------------
> Class Name                                          | Objects | Shallow Heap | Retained Heap
> ---------------------------------------------------------------------------------------------
> org.apache.ambari.server.orm.entities.StageEntity   | 192,356 |   18,466,176 | 3,075,693,136
> org.apache.ambari.server.orm.entities.StageEntity_  |       0 |            0 |
> org.apache.ambari.server.orm.entities.StageEntityPK |       0 |            0 |
> ---------------------------------------------------------------------------------------------
> {noformat}
> Each {{StageEntity}} retains about 30 KB (28,576 bytes in this sample):
> {noformat}
> Class Name | Shallow Heap | Retained Heap
> ----------------------------------------------------------------------------------------------------
> org.apache.ambari.server.orm.entities.StageEntity @ 0x738e03260 |           96 |        28,576
> |- <class> class org.apache.ambari.server.orm.entities.StageEntity @ 0x64058d268 |            8 |             8
> |- skippable java.lang.Integer @ 0x6401e9738  0 |           16 |            16
> |- clusterId java.lang.Long @ 0x64026c908  2 |           24 |            24
> |- requestId java.lang.Long @ 0x64026d840  31 |           24 |            24
> |- _persistence_primaryKey org.eclipse.persistence.internal.identitymaps.CacheId @ 0x642ce20e0 |           24 |            48
> |- _persistence_cacheKey org.eclipse.persistence.internal.identitymaps.HardCacheWeakIdentityMap$ReferenceCacheKey @ 0x6469cf328 |          104 |           136
> |- request org.apache.ambari.server.orm.entities.RequestEntity @ 0x728d046e8 |          112 |           432
> |- _persistence_listener org.eclipse.persistence.internal.descriptors.changetracking.AttributeChangeListener @ 0x72f073f20 |           32 |            32
> |- stageId java.lang.Long @ 0x7350c8b08  1199 |           24 |            24
> |- logInfo java.lang.String @ 0x7350c8b20  /tmp/ambari |           24 |            64
> |- requestContext java.lang.String @ 0x7350c8b38  Restarting DataNode on perf400-c-371.c.pramod-thangali.internal |           24 |           168
> |- hostRoleCommands org.eclipse.persistence.indirection.IndirectList @ 0x738a0ceb0 |           64 |           184
> |- roleSuccessCriterias org.eclipse.persistence.indirection.IndirectList @ 0x738a0cef0 |           64 |           184
> |- commandParamsStage byte[141] @ 0x738c46cc8  {"restart_type":"rolling_upgrade","upgrade_direction":"upgrade","version":"2.2.6.0-2799","target_stack":"HDP-2.2","original_stack":"HDP-2.2"} |          160 |           160
> |- hostParamsStage byte[776] @ 0x738dc16b0  {"ambari_db_rca_driver":"org.postgresql.Driver","ambari_db_rca_password":"mapred","ambari_db_rca_url":"jdbc:postgresql://perf400-a-1.c.pramod-thangali.internal/ambarirca","ambari_db_rca_username":"mapred","current_version":"2.2.0.0-2041","db_driver_filenam... |          792 |           792
> |- clusterHostInfo byte[26774] @ 0x739006378  {"nimbus_hosts":["278"],"all_racks":["/default-rack:0-405"],"ambari_server_host":["perf400-a-1.c.pramod-thangali.internal"],"app_timeline_server_hosts":["138"],"hive_mysql_host":["247"],"falcon_server_hosts":["2"],"hbase_master_hosts":["2"],"accumulo_maste... |       26,792 |        26,792
> ----------------------------------------------------------------------------------------------------
> {noformat}
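
The retained sizes in the StageEntity dump above can be added up to see where the ~28 KB goes. The following is only an arithmetic sketch: the class and method names are made up for illustration, and the byte counts are copied from the dump. It shows that the three byte[] fields account for 27,744 of the 28,576 retained bytes (about 97%), with clusterHostInfo alone at about 94%.

```java
import java.util.Locale;

// Hypothetical helper class; the figures are copied from the heap dump in this issue.
final class StageEntityFootprint {
    // Retained sizes in bytes of the three byte[] fields of one StageEntity.
    static final long COMMAND_PARAMS_STAGE = 160;
    static final long HOST_PARAMS_STAGE = 792;
    static final long CLUSTER_HOST_INFO = 26_792;
    // Retained size in bytes of the whole StageEntity instance.
    static final long STAGE_ENTITY_RETAINED = 28_576;

    /** Sum of the retained sizes of the three byte[] fields. */
    static long byteArrayTotal() {
        return COMMAND_PARAMS_STAGE + HOST_PARAMS_STAGE + CLUSTER_HOST_INFO;
    }

    /** Share of the entity's retained heap taken by the given number of bytes. */
    static double percentOfEntity(long bytes) {
        return 100.0 * bytes / STAGE_ENTITY_RETAINED;
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT,
                "byte[] fields: %d of %d bytes (%.1f%%)",
                byteArrayTotal(), STAGE_ENTITY_RETAINED, percentOfEntity(byteArrayTotal())));
        System.out.println(String.format(Locale.ROOT,
                "clusterHostInfo alone: %.1f%%", percentOfEntity(CLUSTER_HOST_INFO)));
    }
}
```
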
> It appears as though a local {{Cache}} in [ActionDBAccessorImpl|https://github.com/apache/ambari/blob/94c091e280a99e07db5f3910873e70aa3c18394f/ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessorImpl.java#L104]
> is holding on to these objects:
> {noformat:title=Shows the cache holding onto a HostEntity which holds onto a UnitOfWork map with lots of stale entities}
> Class Name | Ref. Objects | Shallow Heap | Ref. Shallow Heap | Retained Heap
> ----------------------------------------------------------------------------------------------------
> java.lang.Thread @ 0x641af65b8  ambari-action-scheduler Native Stack, Thread |           76 |          120 |             7,296 |     4,960,776
> |- <Java Local> org.apache.ambari.server.actionmanager.ActionDBAccessorImpl$$EnhancerByGuice$$dcf333e8 @ 0x640538f40 |           75 |          248 |             7,200 |   640,497,232
> |  '- hostRoleCommandCache com.google.common.cache.LocalCache$LocalManualCache @ 0x640474b58 |           75 |           16 |             7,200 |   640,496,984
> |     '- localCache com.google.common.cache.LocalCache @ 0x640da1650 |           75 |          128 |             7,200 |   640,496,968
> |        '- segments com.google.common.cache.LocalCache$Segment[4] @ 0x640f27e88 |           75 |           32 |             7,200 |   640,496,840
> |           |- [1] com.google.common.cache.LocalCache$Segment @ 0x6410ee3c8 |           22 |           80 |             2,112 |   151,456,800
> |           |  |- table java.util.concurrent.atomic.AtomicReferenceArray @ 0x6470826f8 |           21 |           16 |             2,016 |         2,080
> |           |  |  '- array java.lang.Object[512] @ 0x65dd9e088 |           21 |        2,064 |             2,016 |         2,064
> |           |  |     |- [346] com.google.common.cache.LocalCache$StrongAccessEntry @ 0x670caa3d0 |            1 |           48 |                96 |     2,854,000
> |           |  |     |  '- valueReference com.google.common.cache.LocalCache$StrongValueReference @ 0x670caa418 |            1 |           16 |                96 |     2,853,928
> |           |  |     |     '- referent org.apache.ambari.server.actionmanager.HostRoleCommand @ 0x670caa430 |            1 |          128 |                96 |     2,853,912
> |           |  |     |        '- hostEntity org.apache.ambari.server.orm.entities.HostEntity @ 0x66f876d18 |            1 |          136 |                96 |     2,827,496
> |           |  |     |           '- _persistence_listener org.eclipse.persistence.internal.descriptors.changetracking.AttributeChangeListener @ 0x66f89f530 |            1 |           32 |                96 |            32
> |           |  |     |              '- uow org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork @ 0x670ca0b30 |            1 |          360 |                96 |     2,826,496
> |           |  |     |                 '- identityMapAccessor org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor @ 0x66f7fbf38 |            1 |           24 |                96 |     2,825,688
> |           |  |     |                    '- identityMapManager org.eclipse.persistence.internal.identitymaps.IdentityMapManager @ 0x670c2b320 |            1 |           48 |                96 |     2,825,664
> |           |  |     |                       '- identityMaps java.util.HashMap @ 0x670c2b350 |            1 |           48 |                96 |     2,824,208
> |           |  |     |                          '- table java.util.HashMap$Node[32] @ 0x670cb1608 |            1 |          144 |                96 |     2,824,160
> |           |  |     |                             '- [5] java.util.HashMap$Node @ 0x670b71bd8 |            1 |           32 |                96 |     1,201,192
> |           |  |     |                                '- value org.eclipse.persistence.internal.identitymaps.UnitOfWorkIdentityMap @ 0x670c5a390 |            1 |           32 |                96 |     1,201,160
> |           |  |     |                                   '- cacheKeys java.util.HashMap @ 0x670c2b4d0 |            1 |           48 |                96 |     1,201,128
> |           |  |     |                                      '- table java.util.HashMap$Node[4096] @ 0x66f7c83c8 |            1 |       16,400 |                96 |     1,201,080
> |           |  |     |                                         '- [3271] java.util.HashMap$Node @ 0x670c772e8 |            1 |           32 |                96 |           200
> |           |  |     |                                            '- value org.eclipse.persistence.internal.identitymaps.CacheKey @ 0x66f756e30 |            1 |           96 |                96 |            96
> |           |  |     |                                               '- object org.apache.ambari.server.orm.entities.StageEntity @ 0x66f4f6f98 |            1 |           96 |                96 |           568
> ----------------------------------------------------------------------------------------------------
> {noformat}
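
The reference chain in the dump above (cache entry -> HostRoleCommand -> HostEntity -> UnitOfWork -> identity map -> StageEntity) suggests one general way to avoid this kind of retention: cache plain value snapshots instead of objects that point at managed entities, and bound the cache size. The sketch below illustrates only that general idea. It is not the change made in the commit referenced in this message, and every class in it (PersistenceContextStub, HostEntityStub, HostSnapshot, BoundedCache) is a hypothetical stand-in written with the JDK alone, not Ambari, Guava, or EclipseLink code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all nested types are hypothetical stand-ins.
final class DetachedCacheSketch {

    /** Stand-in for a persistence context that tracks every entity it has loaded. */
    static final class PersistenceContextStub {
        final List<byte[]> trackedEntities = new ArrayList<>();
    }

    /** Stand-in for a managed entity: it points back at its persistence context. */
    static final class HostEntityStub {
        final long hostId;
        final String hostName;
        final PersistenceContextStub context;

        HostEntityStub(long hostId, String hostName, PersistenceContextStub context) {
            this.hostId = hostId;
            this.hostName = hostName;
            this.context = context;
        }
    }

    /** Immutable snapshot: copies plain values and keeps no reference to the entity. */
    static final class HostSnapshot {
        final long hostId;
        final String hostName;

        private HostSnapshot(long hostId, String hostName) {
            this.hostId = hostId;
            this.hostName = hostName;
        }

        static HostSnapshot of(HostEntityStub entity) {
            return new HostSnapshot(entity.hostId, entity.hostName);
        }
    }

    /** Size-bounded LRU cache built on LinkedHashMap's access ordering. */
    static final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
        private static final long serialVersionUID = 1L;
        private final int maxEntries;

        BoundedCache(int maxEntries) {
            super(16, 0.75f, true); // true = iterate in access order, eldest first
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries; // evict once the bound is exceeded
        }
    }

    public static void main(String[] args) {
        PersistenceContextStub context = new PersistenceContextStub();
        context.trackedEntities.add(new byte[26_774]); // size taken from the clusterHostInfo field

        HostEntityStub entity = new HostEntityStub(371L, "host-371", context);

        // Caching the snapshot leaves the context (and everything it tracks) unreferenced
        // by the cache, so it can be collected once the entity itself goes out of scope.
        BoundedCache<Long, HostSnapshot> cache = new BoundedCache<>(2);
        cache.put(entity.hostId, HostSnapshot.of(entity));
        System.out.println("cached entries: " + cache.size());
    }
}
```

Whether snapshots, weak references, or explicit cache invalidation is the right choice depends on how the cached commands are used; the sketch only shows why a cached value that holds an entity reference can pin an entire unit of work.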



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
