ambari-dev mailing list archives

From "Jeff Sposetti (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-4133) Perf issues on Hosts page - freezes for several seconds and then unfreezes repeatedly on a large cluster
Date Fri, 20 Dec 2013 12:38:09 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Sposetti updated AMBARI-4133:
----------------------------------

    Description: 
On a 600-node cluster, the Hosts page freezes for about 5 seconds, unfreezes for about 10 seconds, then freezes again for about 5 seconds, and so on.
The Chrome profiler shows that App.Host's *criticalAlertsCount* is eating up the CPU. It is called by App.MainHostView's *hostCounts*, which in turn is called by App.MainHostView's *label*. This appears to be the cause of the freeze/unfreeze behavior.

{code}
    criticalAlertsCount: function () {
      return App.router.get('clusterController.alerts')
        .filterProperty('hostName', this.get('hostName'))
        .filterProperty('isOk', false)
        .filterProperty('ignoredForHosts', false).length;
    }.property('App.router.clusterController.alerts.length'),
{code}

This computed property is re-evaluated for every single host in the cluster every time we reload the alerts from the server, so each refresh does roughly O(hosts × alerts) work.
There are several approaches to fix this problem:
1. The server should include alert info as part of the Host resource. That way we can simply map it and the client does not have to do much. This will be done in 1.5.0 with the changes to Nagios alerting.
2. Since approach 1 won't land until 1.5.0, we are left with improving the efficiency of the front-end code. Upon loading alerts, we can put them into a map so that lookup by host (and service) is fast; Ember's filterProperty is a linear scan, which is very inefficient on a large array such as the list of all alerts in the cluster, especially when it is repeated for every host in the cluster. We can also compute and store aggregate counts (like the total number of hosts with critical alerts) while mapping the alerts. I'm speculating that we can get a big perf boost just from these changes; a rough sketch follows below.
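
To make approach 2 concrete, here is a minimal sketch of the map-based lookup. It assumes the Ember structure shown in the snippet above; the names updateAlertsIndex, criticalAlertsCountByHost and hostsWithCriticalAlertsCount are hypothetical, not the actual Ambari code.

{code}
// Rough sketch only (hypothetical names, not the actual Ambari code).
// Build a per-host index once per alerts reload instead of filtering
// the full alerts array separately for every host.

// e.g. in the cluster controller, after alerts are (re)loaded:
updateAlertsIndex: function () {
  var countsByHost = {};      // hostName -> number of critical (not OK, not ignored) alerts
  var hostsWithCritical = 0;  // aggregate: hosts with at least one critical alert

  this.get('alerts').forEach(function (alert) {
    if (alert.get('isOk') || alert.get('ignoredForHosts')) return;
    var hostName = alert.get('hostName');
    if (!countsByHost[hostName]) {
      countsByHost[hostName] = 0;
      hostsWithCritical++;
    }
    countsByHost[hostName]++;
  });

  this.set('criticalAlertsCountByHost', countsByHost);
  this.set('hostsWithCriticalAlertsCount', hostsWithCritical);
},

// App.Host then reads its count from the index instead of filtering:
criticalAlertsCount: function () {
  var countsByHost = App.router.get('clusterController.criticalAlertsCountByHost') || {};
  return countsByHost[this.get('hostName')] || 0;
}.property('App.router.clusterController.criticalAlertsCountByHost'),
{code}

With this shape, each alert reload does a single O(alerts) pass, and each host's count becomes an O(1) lookup instead of three linear scans over the whole alerts array.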


  was:
On a 600-node cluster, the Hosts page hangs for about 5 seconds and then unblocks for about
10 seconds, then freezes for 5 seconds, etc.
Chrome profiler shows that App.Host's *criticalAlertsCount* is eating up the CPU.  This is
called by App.MainHostView's *hostCounts*, which is called by App.MainHostView's *label*.
 This seems to be the cause for this freeze/unfreeze behavior. 

{code}
    criticalAlertsCount: function () {
      return App.router.get('clusterController.alerts')
        .filterProperty('hostName', this.get('hostName'))
        .filterProperty('isOk', false)
        .filterProperty('ignoredForHosts', false).length;
    }.property('App.router.clusterController.alerts.length'),
{code}

This piece of code gets called for every single host in the cluster every time we reload the
alerts from the server.
There are several approaches to fix this problem:
1. The server should have alert info as part of the Host resource.  This way, we can simply
map it and the client does not have to do much.  This will be done in Baikal via BUG-11704.
2. Since 1 won't be done until Baikal, we are left with the choice to improve efficiency of
the front code.  Upon loading alerts, we can load them into a map so that look up by host
(and service) would be fast; Ember's filterProperty is a linear search so it is very inefficient,
especially on a large array, like a list of all alerts in the cluster and doing this over
and over again for all the hosts in the cluster.  Also, we can sum up and store the aggregate
count (like total number of hosts with critical alerts) as we map alerts.  I'm speculating
that we can get a big perf boost just by doing these things.



> Perf issues on Hosts page - freezes for several seconds and then unfreezes repeatedly on a large cluster
> --------------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-4133
>                 URL: https://issues.apache.org/jira/browse/AMBARI-4133
>             Project: Ambari
>          Issue Type: Task
>    Affects Versions: 1.4.3
>            Reporter: Andrii Tkach
>            Assignee: Andrii Tkach
>            Priority: Critical
>             Fix For: 1.4.3
>
>         Attachments: AMBARI-4133.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
