hadoop-yarn-issues mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1343) NodeManagers additions/restarts are not reported as node updates in AllocateResponse responses to AMs
Date Thu, 24 Oct 2013 13:56:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804200#comment-13804200 ]

Alejandro Abdelnur commented on YARN-1343:
------------------------------------------

[~bikassaha], I disagree that this is not a bug.

From the {{AllocateResponse}} javadocs:

{code}
  /**
   * Get the list of <em>updated <code>NodeReport</code>s</em>. Updates could
   * be changes in health, availability etc of the nodes.
   * @return The delta of updated nodes since the last response
   */
  @Public
  @Stable
  public abstract List<NodeReport> getUpdatedNodes();
{code}

And from {{AMRMClientAsync}}:

{code}
    /**
     * Called when nodes tracked by the ResourceManager have changed in health,
     * availability etc.
     */
    public void onNodesUpdated(List<NodeReport> updatedNodes);
{code}

A node joining or rejoining the cluster is a change in *availability*.

Following your reasoning, a LOST node should not be reported either: it is gone, so it does
not have a status any more. But it is currently reported.

The current behavior is asymmetric and unexpected, and it should be fixed along the lines
of this JIRA.
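
To make the asymmetry concrete, here is a minimal sketch, assuming an AM that maintains its view of usable nodes purely from the deltas it is handed. The {{Report}} and {{State}} types and the {{NodeUpdateSketch}} class are hypothetical stand-ins for illustration, not the real {{NodeReport}}/{{NodeState}} classes, and the handler only mirrors the role of {{onNodesUpdated}}.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in types; this is NOT the real Hadoop YARN API.
class NodeUpdateSketch {
    // Simplified set of node states, assumed for this illustration.
    enum State { RUNNING, UNHEALTHY, LOST }

    // Simplified stand-in for a node report: just an id and a state.
    static final class Report {
        final String nodeId;
        final State state;

        Report(String nodeId, State state) {
            this.nodeId = nodeId;
            this.state = state;
        }
    }

    // The AM's view of usable nodes, built only from reported deltas.
    private final Set<String> usable = new LinkedHashSet<>();

    // Plays the role of onNodesUpdated(List<NodeReport>): apply each delta.
    void onNodesUpdated(List<Report> updatedNodes) {
        for (Report r : updatedNodes) {
            if (r.state == State.RUNNING) {
                usable.add(r.nodeId);
            } else {
                usable.remove(r.nodeId);
            }
        }
    }

    // Returns a copy so callers cannot mutate the internal view.
    Set<String> usableNodes() {
        return new LinkedHashSet<>(usable);
    }

    public static void main(String[] args) {
        NodeUpdateSketch am = new NodeUpdateSketch();
        am.onNodesUpdated(Arrays.asList(
                new Report("node1", State.RUNNING),
                new Report("node2", State.RUNNING)));

        // The LOST transition is reported, so the AM drops node1.
        am.onNodesUpdated(Arrays.asList(new Report("node1", State.LOST)));
        System.out.println("after LOST: " + am.usableNodes());

        // If the restart is never reported, no callback arrives here and
        // the AM's view stays stale even though node1 is back.
        System.out.println("restart not reported: " + am.usableNodes());

        // With the restart reported as an update, the view recovers.
        am.onNodesUpdated(Arrays.asList(new Report("node1", State.RUNNING)));
        System.out.println("restart reported: " + am.usableNodes());
    }
}
```

In this sketch the AM can only recover its view of node1 if the rejoin arrives as an update, which is the symmetry argued for above.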

And we can follow up with another JIRA to improve things as you suggested.

> NodeManagers additions/restarts are not reported as node updates in AllocateResponse responses to AMs
> -----------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1343
>                 URL: https://issues.apache.org/jira/browse/YARN-1343
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.2.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>            Priority: Critical
>             Fix For: 2.2.1
>
>         Attachments: YARN-1343.patch
>
>
> If a NodeManager joins the cluster or gets restarted, running AMs never receive the node update indicating the Node is running.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
