ambari-issues mailing list archives

From "Hadoop QA (JIRA)" <>
Subject [jira] [Commented] (AMBARI-17248) Reduce the idle time before first command from next stage is executed on a host
Date Wed, 22 Jun 2016 14:46:58 GMT


Hadoop QA commented on AMBARI-17248:

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 4 new or modified
test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in ambari-server:


Test results:
Console output:

This message is automatically generated.

> Reduce the idle time before first command from next stage is executed on a host
> -------------------------------------------------------------------------------
>                 Key: AMBARI-17248
>                 URL:
>             Project: Ambari
>          Issue Type: Improvement
>          Components: ambari-agent, ambari-server
>            Reporter: Sebastian Toader
>            Assignee: Sebastian Toader
>             Fix For: 2.4.0
>         Attachments: AMBARI-17248.trunk.v7.patch
> The server sends the commands to be executed by ambari-agents in its responses to agent
heartbeat messages.
> When the server processes a received heartbeat, it checks whether further commands are scheduled
for the ambari-agent and adds them to the heartbeat response.
> The server organises the commands that can be executed in parallel into stages. Ambari
server ensures that only the commands of a single stage are scheduled to be executed by the
agent, and starts scheduling the commands of the next stage only after all commands of the current
stage have finished successfully.
> The command statuses received with a heartbeat message are processed in HeartBeatProcessor
and ActionScheduler asynchronously to the creation of the heartbeat response, so when the
response is created the commands for the next stage are not yet scheduled. This means that
the next commands are sent to the agent only with the following heartbeat.
> Agents currently send a heartbeat to the server on command completion, or at a timeout
interval of ~10 seconds if there are no commands to be executed.
> This means that when the server receives a heartbeat triggered by the completion of the
last command of the current stage, it sends the commands for the next stage only about
10 seconds later, when the next heartbeat is received. This leaves agents idle for a considerable
amount of time when there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending stages to be executed.
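
The proposal above can be illustrated with a minimal sketch. It is not taken from the attached patch: the function names, the 1-second active interval, and the idea that the agent knows whether stages are still pending (for example through a flag in the heartbeat response) are all assumptions made for illustration. Only the ~10 second idle interval comes from the issue description.

```python
# Hypothetical sketch; none of these names come from the Ambari code base.
IDLE_INTERVAL_SEC = 10.0    # ~10 s timeout quoted in the issue description
ACTIVE_INTERVAL_SEC = 1.0   # assumed faster rate; the issue gives no value


def next_heartbeat_delay(has_pending_stages):
    """Return how long the agent waits before sending its next heartbeat.

    has_pending_stages is assumed to be known to the agent, e.g. signalled
    by the server in the previous heartbeat response.
    """
    return ACTIVE_INTERVAL_SEC if has_pending_stages else IDLE_INTERVAL_SEC


def worst_case_idle(stage_count, delay_between_stages):
    """Upper bound on the time spent waiting between consecutive stages."""
    return max(stage_count - 1, 0) * delay_between_stages
```

Under these assumptions, a request with 5 stages can idle for up to 4 x 10 = 40 seconds at the fixed interval, and for up to 4 x 1 = 4 seconds if the agent heartbeats every second while stages are pending.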

This message was sent by Atlassian JIRA
