hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4091) Improvement: Introduce more debug/diagnostics information to detail out scheduler activity
Date Wed, 09 Sep 2015 18:17:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737337#comment-14737337 ]

Wangda Tan commented on YARN-4091:


bq. However, my doubt is that we cannot do this for every heartbeat. If we want to do it for a specific
heartbeat of a specific node, we need some external input, such as a command or a REST query.

That is what I meant! We will do such debug logging entirely on demand. In my mind, the REST
API looks like:
- Request: contains nodeId as a parameter.
- Response: "pending fetching" once the request is accepted. After the requested node finishes
its heartbeat, the response contains all the debug information.
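
The request/response flow above can be sketched roughly as follows. This is only an illustration of the on-demand idea, not YARN code: the class `SchedulerDebugRecorder` and its method names are hypothetical, and only the "pending fetching" status text comes from the description above.

```python
from typing import Dict, List, Optional, Union

PENDING = "pending fetching"  # status text from the proposal above


class SchedulerDebugRecorder:
    """Hypothetical on-demand recorder; not an actual YARN class."""

    def __init__(self) -> None:
        # node_id -> messages collected while a request is outstanding
        self._requested: Dict[str, List[str]] = {}
        # node_id -> messages from the heartbeat that completed the request
        self._finished: Dict[str, List[str]] = {}

    def request(self, node_id: str) -> str:
        """REST request: register interest in this node's heartbeat."""
        self._finished.pop(node_id, None)
        self._requested[node_id] = []
        return PENDING

    def record(self, node_id: str, message: str) -> None:
        """Called by the scheduler during a heartbeat; no-op unless requested."""
        if node_id in self._requested:
            self._requested[node_id].append(message)

    def heartbeat_finished(self, node_id: str) -> None:
        """Called once the node's heartbeat has been fully processed."""
        if node_id in self._requested:
            self._finished[node_id] = self._requested.pop(node_id)

    def response(self, node_id: str) -> Optional[Union[str, List[str]]]:
        """REST response: debug info if ready, the pending marker if still
        waiting, or None if nothing was requested for this node."""
        if node_id in self._finished:
            return self._finished[node_id]
        if node_id in self._requested:
            return PENDING
        return None
```

Because `record` does nothing for nodes without an outstanding request, ordinary heartbeats pay almost no cost, which is the point of doing this on demand rather than for every heartbeat.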

I feel we may not need queue/application as input: we can be sure a node heartbeats
every few seconds, but we don't know whether a given queue/app will be accessed. We can instead
highlight the specified queue/application in the web UI.
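
As a rough illustration of that split (record per node, then highlight per queue/application at display time), assuming each recorded activity is a small dict with hypothetical `queue` and `app` keys:

```python
from typing import Dict, List, Optional


def highlight(records: List[Dict[str, str]],
              queue: Optional[str] = None,
              app_id: Optional[str] = None) -> List[Dict[str, object]]:
    """Return copies of the node's activity records, flagging those that
    belong to the requested queue or application (hypothetical record shape)."""
    result = []
    for rec in records:
        match = ((queue is not None and rec.get("queue") == queue)
                 or (app_id is not None and rec.get("app") == app_id))
        # Copy the record so the recorded data itself is left untouched.
        result.append({**rec, "highlight": match})
    return result
```

The recording side stays keyed by nodeId only; the queue/application filter is applied afterwards, when the results are displayed.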

> Improvement: Introduce more debug/diagnostics information to detail out scheduler activity
> ------------------------------------------------------------------------------------------
>                 Key: YARN-4091
>                 URL: https://issues.apache.org/jira/browse/YARN-4091
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler, resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: Improvement on debugdiagnostic information - YARN.pdf
> As schedulers are improved with various new capabilities, more of the configurations that tune
the schedulers start to trigger actions such as limiting container assignment to an application
or delaying container allocation.
> No clear information is passed from the scheduler to the outside world in these
scenarios, which makes debugging much harder.
> This ticket is an effort to introduce better-defined states at the points in the scheduler
where it skips/rejects container assignment, activates an application, etc. Such information will
help users understand what is happening in the scheduler.
> Attaching a short proposal for initial discussion. We would like to improve on it as
we discuss.

This message was sent by Atlassian JIRA
