ambari-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-13974) Retrieving Failed Service Checks Takes Too Long On Large Clusters
Date Fri, 20 Nov 2015 06:31:10 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015305#comment-15015305 ]

Hudson commented on AMBARI-13974:
---------------------------------

FAILURE: Integrated in Ambari-branch-2.1 #901 (See [https://builds.apache.org/job/Ambari-branch-2.1/901/])
AMBARI-13974 - Retreiving Failed Service Checks Takes Too Long On Large (jhurley: [http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=83f5d0e4262882d75acc9e56524e7759913f6a7d])
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionManager.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/internal/TaskResourceProvider.java
* ambari-server/src/test/java/org/apache/ambari/server/controller/internal/AbstractResourceProviderTest.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessor.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/TaskStatusResponse.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementController.java
* ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostEntity_.java
* ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity_.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessorImpl.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/TaskStatusRequest.java
* ambari-server/src/test/java/org/apache/ambari/server/controller/AmbariManagementControllerTest.java
* ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java
* ambari-server/src/test/java/org/apache/ambari/server/controller/internal/TaskResourceProviderTest.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleCommand.java


> Retrieving Failed Service Checks Takes Too Long On Large Clusters
> -----------------------------------------------------------------
>
>                 Key: AMBARI-13974
>                 URL: https://issues.apache.org/jira/browse/AMBARI-13974
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.0.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Critical
>             Fix For: 2.1.3
>
>         Attachments: AMBARI-13974.patch
>
>
> *STR:*
> * Launch a Rolling Upgrade on a large cluster (500+ nodes)
> * Proceed to Finalize step
> *Actual Result:*
> Call: 
> {code}
> /api/v1/clusters/c500/upgrades/69/upgrade_groups?upgrade_items/UpgradeItem/status=COMPLETED&upgrade_items/tasks/Tasks/status.in(FAILED,ABORTED,TIMEDOUT)&upgrade_items/tasks/Tasks/command=SERVICE_CHECK&fields=upgrade_items/tasks/Tasks/command_detail,upgrade_items/tasks/Tasks/status&minimal_response=true
> {code}
> This call fails due to a timeout, so no failed Service Checks are shown to the user.
> The root of the problem is how the REST API handles subqueries. For every group that matches, it attempts to retrieve every stage and every task, and then produces a slice of results from an in-memory comparison.
> This should really go through the JPA layer, since these are simple comparisons on DB fields.
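
The difference described above can be sketched in plain Java. This is only an illustration, not the code from the attached patch: the `Task` class is a hypothetical stand-in for a task row, and the entity and field names in the JPQL string (`HostRoleCommandEntity`, `roleCommand`, `status`) are assumptions. The status values and the `SERVICE_CHECK` command come from the REST call quoted in the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FailedServiceCheckSketch {

    /** Minimal stand-in for a task row; not Ambari's real entity. */
    static final class Task {
        final long id;
        final String command;
        final String status;

        Task(long id, String command, String status) {
            this.id = id;
            this.command = command;
            this.status = status;
        }
    }

    /** Statuses taken from the REST call in the issue description. */
    static final Set<String> FAILED_STATUSES =
            new HashSet<>(Arrays.asList("FAILED", "ABORTED", "TIMEDOUT"));

    /**
     * Hypothetical JPQL that would apply the same filter inside the database.
     * Entity and field names are assumptions, not taken from the patch.
     */
    static final String JPQL =
            "SELECT t FROM HostRoleCommandEntity t "
          + "WHERE t.roleCommand = :command AND t.status IN :statuses";

    /** The slow pattern: every task is already loaded, then compared in memory. */
    static List<Task> filterInMemory(List<Task> allTasks, String command) {
        List<Task> matches = new ArrayList<>();
        for (Task task : allTasks) {
            if (command.equals(task.command) && FAILED_STATUSES.contains(task.status)) {
                matches.add(task);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Task> all = Arrays.asList(
                new Task(1, "SERVICE_CHECK", "COMPLETED"),
                new Task(2, "SERVICE_CHECK", "FAILED"),
                new Task(3, "RESTART", "FAILED"),
                new Task(4, "SERVICE_CHECK", "TIMEDOUT"));
        List<Task> failed = filterInMemory(all, "SERVICE_CHECK");
        System.out.println("Loaded " + all.size() + " tasks, kept " + failed.size());
        System.out.println("Database-side equivalent: " + JPQL);
    }
}
```

With the in-memory approach the cost grows with the total number of tasks in the upgrade, even when only a handful match. Pushing the same two comparisons into a query lets the database return just the matching rows.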



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
