hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing
Date Fri, 18 Mar 2016 14:28:33 GMT

    [ https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201546#comment-15201546 ]

Sunil G commented on YARN-4837:
-------------------------------

Thanks [~vinodkv] for pitching in.

YARN-2005 blacklists a node if an AM container launch on it failed due to DISK_FAILED. After YARN-4284,
AM-container-failure blacklisting applies to all container failures except PREEMPTED.
There were a few discussions on the use-case aspects of this change.

If the blacklisting (AM container failure) feature is enabled at the cluster level, all applications
are forced to comply with the blacklisting rule. YARN-4389 also added an option to disable this
feature from the application side, and to control the threshold if it is too strict (or vice
versa). Yes, I agree with your point that it is too early for users to take blacklisting decisions
without having the much-needed information. But given the current aggressive behavior, this
change helped applications skip the feature.
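
To make the opt-out and threshold idea concrete, here is a minimal sketch. It is not the actual YARN code: the class name, fields, and method are hypothetical, and it assumes the threshold is expressed as a fraction of cluster nodes above which the blacklist is ignored.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical policy: per-application opt-out plus a disable threshold. */
class ThresholdBlacklistPolicy {
    private final boolean appOptedOut;      // assumption: the app may disable blacklisting
    private final double disableThreshold;  // assumption: fraction of cluster nodes, in (0, 1]

    ThresholdBlacklistPolicy(boolean appOptedOut, double disableThreshold) {
        if (disableThreshold <= 0.0 || disableThreshold > 1.0) {
            throw new IllegalArgumentException("disableThreshold must be in (0, 1]");
        }
        this.appOptedOut = appOptedOut;
        this.disableThreshold = disableThreshold;
    }

    /**
     * Returns the nodes to avoid for the next AM attempt. The blacklist is
     * dropped entirely when the app opted out, or when the failed nodes make
     * up at least disableThreshold of the cluster.
     */
    Set<String> effectiveBlacklist(Set<String> failedNodes, int clusterNodeCount) {
        if (appOptedOut || clusterNodeCount <= 0) {
            return Collections.emptySet();
        }
        double fraction = (double) failedNodes.size() / clusterNodeCount;
        if (fraction >= disableThreshold) {
            return Collections.emptySet();
        }
        return new HashSet<>(failedNodes);
    }
}
```

With a threshold of 0.5 on a 4-node cluster, one failed node is still blacklisted, while two failed nodes cause the blacklist to be ignored so the app is not starved of nodes.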

I agree that this has to be a controllable feature that does not cause problems in a busy cluster.
I think a time-based purging solution may be ideal, so that the same app can use the node
again.
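
A rough sketch of what time-based purging could look like follows. This is only an illustration of the idea, not a proposed patch: the class and method names are hypothetical, and the current time is passed in explicitly so the behavior is easy to reason about.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

/** Hypothetical per-application blacklist whose entries expire after a fixed interval. */
class ExpiringNodeBlacklist {
    private final long expiryMillis;
    // node name -> time (ms) at which the node was last blacklisted
    private final Map<String, Long> blacklistedAt = new HashMap<>();

    ExpiringNodeBlacklist(long expiryMillis) {
        if (expiryMillis <= 0) {
            throw new IllegalArgumentException("expiryMillis must be positive");
        }
        this.expiryMillis = expiryMillis;
    }

    /** Records an AM container failure on the node at the given time. */
    void addNode(String node, long nowMillis) {
        blacklistedAt.put(node, nowMillis);
    }

    /** Purges entries that have reached the expiry interval, then returns the remaining nodes. */
    Set<String> getBlacklistedNodes(long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = blacklistedAt.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() >= expiryMillis) {
                it.remove();
            }
        }
        return new HashSet<>(blacklistedAt.keySet());
    }
}
```

Purging lazily on read keeps the sketch free of background threads; a real implementation would also need to decide whether a repeated failure on the same node should extend the expiry, as it does here.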

> User facing aspects of 'AM blacklisting' feature need fixing
> ------------------------------------------------------------
>
>                 Key: YARN-4837
>                 URL: https://issues.apache.org/jira/browse/YARN-4837
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>
> Was reviewing the user-facing aspects that we are releasing as part of 2.8.0.
> Looking at the 'AM blacklisting feature', I see several things to be fixed before we
> release it in 2.8.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
