hadoop-yarn-issues mailing list archives

From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-9035) Allow better troubleshooting of FS container assignments and lack of container assignments
Date Thu, 06 Dec 2018 19:51:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711944#comment-16711944 ]

Haibo Chen commented on YARN-9035:
----------------------------------

[~wilfreds] probably has a much better idea of what is preferred from a supportability perspective.

> Allow better troubleshooting of FS container assignments and lack of container assignments
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-9035
>                 URL: https://issues.apache.org/jira/browse/YARN-9035
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-9035.001.patch
>
>
> The call chain that starts at {{FairScheduler.attemptScheduling}}, passes through {{FSQueue}} (parent
/ leaf) {{assignContainer}} and ends in {{FSAppAttempt#assignContainer}} involves many calls and
many conditions under which {{Resources.none()}} can be returned, meaning that no container is
allocated.
>  Many of these empty assignments come without a debug log statement, so it is very
hard to tell which condition led the {{FairScheduler}} to decide that no container should be
allocated.
>  On top of that, in many places it is just as difficult to tell why a container was allocated
to an app attempt.
> The goal is to have a common place (i.e. a class) that does all the logging, so users
can conveniently control all the logs if they are curious why container assignments did (or did not)
happen.
>  Also, it would be handy if readers of the log could easily tell which {{AppAttempt}} a
log record was created for. In other words, every log record should include the ID of the
application / app attempt, if possible.
>  
> Details of implementation: 
>  As most of the already in-place debug messages were protected by a condition that checks
whether the debug level is enabled on loggers, I followed a similar pattern. All the relevant
log messages are created with the class {{ResourceAssignment}}. 
>  This class is a wrapper for the assigned {{Resource}} object and has a single logger,
so clients should use its helper methods to create log records. There is a helper method called
{{shouldLogReservationActivity}} that checks if DEBUG or TRACE level is activated on the logger.

>  See the javadoc on this class for further information.
>  
> {{ResourceAssignment}} is also responsible for adding the app / app attempt ID to every
log record (with some exceptions).
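
The wrapper idea described above can be sketched roughly as follows. This is not the code from YARN-9035.001.patch: the class name {{ResourceAssignmentSketch}}, its factory methods, the {{shouldLog}} helper and the message format are all hypothetical, a plain memory value stands in for {{Resource}}, a string stands in for the app attempt ID, and {{java.util.logging}} (where FINE roughly corresponds to DEBUG) is used only to keep the example self-contained.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: an assignment result that owns the single logger. */
final class ResourceAssignmentSketch {
  // One logger for all assignment messages, so one setting controls them all.
  private static final Logger LOG =
      Logger.getLogger(ResourceAssignmentSketch.class.getName());

  private final String appAttemptId; // assumption: stand-in for the app attempt ID
  private final long memoryMb;       // assumption: stand-in for a Resource
  private final String reason;

  private ResourceAssignmentSketch(String appAttemptId, long memoryMb,
      String reason) {
    this.appAttemptId = appAttemptId;
    this.memoryMb = memoryMb;
    this.reason = reason;
  }

  /** Guard in the spirit of the debug-level checks mentioned in the issue. */
  static boolean shouldLog() {
    return LOG.isLoggable(Level.FINE);
  }

  /** Empty assignment: nothing was allocated, and the reason is recorded. */
  static ResourceAssignmentSketch none(String appAttemptId, String reason) {
    ResourceAssignmentSketch result =
        new ResourceAssignmentSketch(appAttemptId, 0L, reason);
    result.log();
    return result;
  }

  /** Successful assignment of the given amount of memory. */
  static ResourceAssignmentSketch assigned(String appAttemptId, long memoryMb,
      String reason) {
    ResourceAssignmentSketch result =
        new ResourceAssignmentSketch(appAttemptId, memoryMb, reason);
    result.log();
    return result;
  }

  boolean isEmpty() {
    return memoryMb == 0L;
  }

  /** Every message starts with the app attempt ID so log lines can be filtered. */
  String describe() {
    String outcome = isEmpty()
        ? "no container assigned"
        : "assigned " + memoryMb + " MB";
    return "[" + appAttemptId + "] " + outcome + ": " + reason;
  }

  private void log() {
    // The message string is only built when the level is enabled.
    if (shouldLog()) {
      LOG.fine(describe());
    }
  }
}
```

With this shape, a caller would return something like {{ResourceAssignmentSketch.none(attemptId, "queue is over its max share")}} instead of a bare empty resource, so the reason for the empty assignment travels with the result.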
>  A couple of check classes are introduced: they are responsible for running and storing the
results of checks that a successful container allocation depends on.
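
One possible shape for such a check class is sketched below. The names {{AllocationChecksSketch}}, {{check}}, {{allPassed}} and {{failedChecks}} are hypothetical and are not taken from the patch. The sketch only illustrates running each precondition once and keeping its named result, so a later log record can say which check blocked the allocation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: runs allocation preconditions and stores their results. */
final class AllocationChecksSketch {

  /** Named outcome of a single check. */
  static final class Result {
    final String name;
    final boolean passed;

    Result(String name, boolean passed) {
      this.name = name;
      this.passed = passed;
    }
  }

  private final List<Result> results = new ArrayList<>();

  /** Evaluates the condition immediately and records the outcome. */
  AllocationChecksSketch check(String name, BooleanSupplier condition) {
    results.add(new Result(name, condition.getAsBoolean()));
    return this;
  }

  /** True only if every recorded check passed. */
  boolean allPassed() {
    for (Result r : results) {
      if (!r.passed) {
        return false;
      }
    }
    return true;
  }

  /** Names of the failed checks, in the order they were run. */
  List<String> failedChecks() {
    List<String> failed = new ArrayList<>();
    for (Result r : results) {
      if (!r.passed) {
        failed.add(r.name);
      }
    }
    return failed;
  }
}
```

A caller could chain a few checks, for example a made-up "node has headroom" and "queue under max share", and include {{failedChecks()}} in the debug message when {{allPassed()}} is false.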



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

