ambari-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-13065) RU: Core Slaves restart schedule is extremely slow on very large cluster
Date Sat, 19 Sep 2015 03:45:04 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14876866#comment-14876866 ]

Hudson commented on AMBARI-13065:
---------------------------------

FAILURE: Integrated in Ambari-branch-2.1 #559 (See [https://builds.apache.org/job/Ambari-branch-2.1/559/])
AMBARI-13065: RU: Core Slaves restart schedule is extremely slow on very large cluster (jluniya)
(jluniya: http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=1aae6477b777996b3a1a2abfe30de471b9f4a85f)
* ambari-server/src/main/java/org/apache/ambari/server/utils/LoopBody.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionScheduler.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessorImpl.java
* ambari-server/src/test/java/org/apache/ambari/server/controller/AmbariManagementControllerTest.java
* ambari-server/src/test/java/org/apache/ambari/server/utils/TestParallel.java
* ambari-server/src/main/java/org/apache/ambari/server/utils/Parallel.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/Request.java
* ambari-server/src/main/java/org/apache/ambari/server/utils/ParallelLoopResult.java


> RU: Core Slaves restart schedule is extremely slow on very large cluster
> ------------------------------------------------------------------------
>
>                 Key: AMBARI-13065
>                 URL: https://issues.apache.org/jira/browse/AMBARI-13065
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.1.2
>            Reporter: Jayush Luniya
>            Assignee: Jayush Luniya
>            Priority: Blocker
>             Fix For: 2.1.2
>
>
> Performed RU on a 1200-node cluster, and the progress of the 'Core Slaves' restarts is
> extremely slow: in 3 hours it restarted only 22 components (screenshot attached). At this
> rate it will take weeks for RU to complete.
> If we look into the agent log where RU core-slaves finished, we see that sequential commands
> are sent 8 minutes apart, which is very slow. The commands themselves execute in under a
> minute.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
