Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: yarn-issues@hadoop.apache.org
Date: Wed, 30 Apr 2014 23:24:18 +0000 (UTC)
From: "Wangda Tan (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.12676558.1383115585632.210688.1398900258507@arcas>
In-Reply-To: <JIRA.12676558.1383115585632@arcas>
References: <JIRA.12676558.1383115585632@arcas>
Subject: [jira] [Commented] (YARN-1368) Common work to re-populate containers’ state into scheduler
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/YARN-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986213#comment-13986213 ]

Wangda Tan commented on YARN-1368:
----------------------------------

Thanks [~jianhe] for this proposal. I think recovering containers from the NM heartbeat is a reasonable approach; +1 for the general idea.
Some minor comments:
bq. Noticed that FiCaSchedulerNode and FSSchedulerNode are almost the same. Any reason for keeping both ? thinking to merge the common methods into SchedulerNode.
IMO we'd better keep both for now. To avoid touching too many parts in this JIRA, we can move the merging of their common logic into a separate task.
bq. ContainerStatus sent in NM registration doesn’t capture enough information for re-constructing the containers. we may replace that with a new object or just adding more fields to encapsulate all the necessary information for re-constructing the container.
Personally, I think creating a new type specialized for container recovery is better. ContainerStatus is also used in the node heartbeat, and including too many fields in every heartbeat is neither safe nor efficient.
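
To illustrate the split between a small heartbeat status and a larger recovery-only type, here is a rough Java sketch. The names HeartbeatContainerStatus and RecoveredContainerReport and all of their fields are hypothetical, made up for this example; they are not the real YARN protocol records and not what the attached patch does.

```java
// Hypothetical sketch: these are NOT the real YARN records. All class and
// field names below are assumptions made for illustration only.
public class ContainerRecoverySketch {

    // Small status that would travel in every NM heartbeat.
    static final class HeartbeatContainerStatus {
        final String containerId;
        final String state;
        final int exitStatus;

        HeartbeatContainerStatus(String containerId, String state, int exitStatus) {
            this.containerId = containerId;
            this.state = state;
            this.exitStatus = exitStatus;
        }
    }

    // Larger report that would be sent only once, at NM registration, with
    // the extra fields the scheduler needs to rebuild the container.
    static final class RecoveredContainerReport {
        final String containerId;
        final String applicationAttemptId;
        final int memoryMb;
        final int vcores;
        final int priority;
        final String state;

        RecoveredContainerReport(String containerId, String applicationAttemptId,
                                 int memoryMb, int vcores, int priority, String state) {
            this.containerId = containerId;
            this.applicationAttemptId = applicationAttemptId;
            this.memoryMb = memoryMb;
            this.vcores = vcores;
            this.priority = priority;
            this.state = state;
        }

        // The heartbeat view carries only a subset of the report's fields.
        HeartbeatContainerStatus toHeartbeatStatus(int exitStatus) {
            return new HeartbeatContainerStatus(containerId, state, exitStatus);
        }
    }

    public static void main(String[] args) {
        RecoveredContainerReport report = new RecoveredContainerReport(
                "container_0001", "appattempt_0001", 2048, 1, 0, "RUNNING");
        HeartbeatContainerStatus status = report.toHeartbeatStatus(0);
        System.out.println(status.containerId + " " + status.state);
    }
}
```

With a split like this, the resource, priority, and attempt fields are sent once per registration instead of on every heartbeat.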

> Common work to re-populate containers’ state into scheduler
> -----------------------------------------------------------
>
>                 Key: YARN-1368
>                 URL: https://issues.apache.org/jira/browse/YARN-1368
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1368.preliminary.patch
>
>
> YARN-1367 adds support for the NM to tell the RM about all currently running containers upon registration. The RM needs to send this information to the schedulers along with the NODE_ADDED_EVENT so that the schedulers can recover the current allocation state of the cluster.
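
As a rough illustration of the flow in the issue description, the scheduler side could look something like the Java sketch below: when a node is added, the containers it reported are replayed into the node's bookkeeping so that used and available resources match what is actually running. ReportedContainer, SchedulerNodeState, and onNodeAdded are hypothetical names for this example, not existing YARN classes or methods, and only memory is tracked to keep it short.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of re-populating scheduler state on node add.
// None of these types are real YARN classes; names are assumptions.
public class NodeAddedRecoverySketch {

    // One running container as reported by the NM at registration.
    static final class ReportedContainer {
        final String containerId;
        final int memoryMb;

        ReportedContainer(String containerId, int memoryMb) {
            this.containerId = containerId;
            this.memoryMb = memoryMb;
        }
    }

    // Per-node bookkeeping kept by the scheduler.
    static final class SchedulerNodeState {
        final int totalMemoryMb;
        private int usedMemoryMb = 0;
        private final Map<String, Integer> launched = new HashMap<>();

        SchedulerNodeState(int totalMemoryMb) {
            this.totalMemoryMb = totalMemoryMb;
        }

        // Record a recovered container; a container that is already
        // known is ignored, so replaying the same report is harmless.
        void recover(ReportedContainer c) {
            if (launched.putIfAbsent(c.containerId, c.memoryMb) == null) {
                usedMemoryMb += c.memoryMb;
            }
        }

        int availableMemoryMb() {
            return totalMemoryMb - usedMemoryMb;
        }
    }

    // Stand-in for handling a node-added event that carries the
    // containers the NM reported as running.
    static SchedulerNodeState onNodeAdded(int totalMemoryMb, List<ReportedContainer> running) {
        SchedulerNodeState node = new SchedulerNodeState(totalMemoryMb);
        for (ReportedContainer c : running) {
            node.recover(c);
        }
        return node;
    }

    public static void main(String[] args) {
        SchedulerNodeState node = onNodeAdded(8192, Arrays.asList(
                new ReportedContainer("container_0001", 1024),
                new ReportedContainer("container_0002", 2048)));
        System.out.println(node.availableMemoryMb()); // prints 5120
    }
}
```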


--
This message was sent by Atlassian JIRA
(v6.2#6252)