hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2022) Preempting an Application Master container can be kept as least priority when multiple applications are marked for preemption by ProportionalCapacityPreemptionPolicy
Date Wed, 11 Jun 2014 23:21:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028572#comment-14028572 ]

Vinod Kumar Vavilapalli commented on YARN-2022:
-----------------------------------------------

bq. Vinod Kumar Vavilapalli and I were discussing keeping this simple: if we can just
not kill the AM container, that would be easier and would work well.
bq. I think many frameworks (MR, Tez etc.) depend on the last AM attempt.
Thanks for filling in on my behalf Mayank.

Here's the real issue. The MapReduce AM does the following to figure out whether this is the
last retry:
{code}
      isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
{code}
Given the above, once we decide to preempt AMs, the default max-attempts of the AM (2) will
kick in and AMs will fail, as has already been seen.

If we do YARN-2074, the failure of AMs is mitigated. But this still doesn't help the MR AM, as
it will start seeing the 2nd attempt as the last retry and clean up the staging directory etc.
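
To make the problem concrete, here is a minimal sketch. It assumes that YARN-2074 stops counting preempted attempts as failures. The class, the method names, and the failure-count alternative are hypothetical illustrations, not code from the MR AM or the RM.

```java
// Illustrative sketch only; the names here are hypothetical, not Hadoop APIs.
final class LastRetrySketch {
    // Mirrors the check quoted above: compares the attempt ID with max attempts.
    static boolean isLastRetryByAttemptId(int attemptId, int maxAppAttempts) {
        return attemptId >= maxAppAttempts;
    }

    // Assumed alternative: count only attempts that really failed, so that
    // preempted attempts do not use up the retry budget.
    static boolean isLastRetryByFailures(int failedAttempts, int maxAppAttempts) {
        return failedAttempts + 1 >= maxAppAttempts;
    }

    public static void main(String[] args) {
        int maxAppAttempts = 2;  // default max-attempts mentioned above
        int attemptId = 2;       // second attempt, started after attempt 1 was preempted
        int failedAttempts = 0;  // assumption: preemption is not counted as a failure

        // The attempt-ID check treats attempt 2 as the last retry.
        System.out.println(isLastRetryByAttemptId(attemptId, maxAppAttempts));
        // A failure-count check would still allow another attempt.
        System.out.println(isLastRetryByFailures(failedAttempts, maxAppAttempts));
    }
}
```

With these inputs the attempt-ID check reports the last retry while the failure-count check does not. That mismatch is why the MR AM would clean up the staging directory too early.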

Once we have user-level preemption (YARN-2069) we are back to square one *even after* this
patch. We CANNOT do without preempting AMs. I think the right solution is to have YARN-2074
and then a separate fix so that the MR AM (or any YARN application) does not depend on
AppAttemptID to figure out whether it is the last retry. I'll file a JIRA.

I'm okay with the current fix to unblock preemption in the very short term.

> Preempting an Application Master container can be kept as least priority when multiple
applications are marked for preemption by ProportionalCapacityPreemptionPolicy
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-2022
>                 URL: https://issues.apache.org/jira/browse/YARN-2022
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-2022-DesignDraft.docx, YARN-2022.2.patch, YARN-2022.3.patch,
YARN-2022.4.patch, Yarn-2022.1.patch
>
>
> Cluster Size = 16GB [2NM's]
> Queue A Capacity = 50%
> Queue B Capacity = 50%
> Consider 3 applications running in Queue A which have taken the full cluster
capacity.
> J1 = 2GB AM + 1GB * 4 Maps
> J2 = 2GB AM + 1GB * 4 Maps
> J3 = 2GB AM + 1GB * 2 Maps
> Another Job J4 is submitted in Queue B [J4 needs a 2GB AM + 1GB * 2 Maps ].
> Currently in this scenario, job J3 will get killed, including its AM.
> It is better if the AM is given the least priority for preemption among multiple applications.
In this same scenario, map tasks from J3 and J2 can be preempted instead.
> Later, when the cluster is free, maps can be allocated to these jobs.
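
The scenario above can be worked through with a small sketch. It is not the ProportionalCapacityPreemptionPolicy code: the class, the two-pass selection, and the candidate order (J3, J2, J1) are assumptions made for illustration. Queue A holds 6GB + 6GB + 4GB = 16GB, and J4 needs 2GB + 2 * 1GB = 4GB.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the ProportionalCapacityPreemptionPolicy code.
final class PreemptionOrderSketch {
    // One running container: owning job, size in GB, and whether it is the AM.
    static final class Container {
        final String job;
        final int gb;
        final boolean isAm;

        Container(String job, int gb, boolean isAm) {
            this.job = job;
            this.gb = gb;
            this.isAm = isAm;
        }
    }

    // Builds one job as in the scenario: a 2GB AM plus `maps` 1GB map containers.
    static List<Container> job(String name, int maps) {
        List<Container> containers = new ArrayList<>();
        containers.add(new Container(name, 2, true));
        for (int i = 0; i < maps; i++) {
            containers.add(new Container(name, 1, false));
        }
        return containers;
    }

    // Selects containers until neededGb is freed: task containers in the
    // first pass, AM containers only in a second pass if still short.
    static List<Container> select(List<Container> candidates, int neededGb) {
        List<Container> chosen = new ArrayList<>();
        int freed = 0;
        for (boolean amPass : new boolean[] {false, true}) {
            for (Container c : candidates) {
                if (freed >= neededGb) {
                    return chosen;
                }
                if (c.isAm == amPass) {
                    chosen.add(c);
                    freed += c.gb;
                }
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Assumed candidate order: J3 first, then J2, then J1.
        List<Container> queueA = new ArrayList<>();
        queueA.addAll(job("J3", 2));
        queueA.addAll(job("J2", 4));
        queueA.addAll(job("J1", 4));

        // J4 needs 2GB (AM) + 2 * 1GB (maps) = 4GB.
        for (Container c : select(queueA, 4)) {
            System.out.println(c.job + (c.isAm ? " AM " : " map ") + c.gb + "GB");
        }
    }
}
```

Under these assumptions the sketch selects two maps from J3 and two maps from J2, and no AM is touched. An AM is only selected when the task containers alone cannot free enough memory.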



--
This message was sent by Atlassian JIRA
(v6.2#6252)
