hadoop-yarn-issues mailing list archives

From "Tao Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.
Date Tue, 04 Jun 2019 03:51:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855280#comment-16855280 ]

Tao Yang commented on YARN-8995:
--------------------------------

Thanks [~zhuqi] for the patch.

I would prefer not to maintain a global map (Map<Enum, Long> eventTypeRecord) that is
updated twice (on enqueue and on dequeue) for every event. After all, it is needed only when
something goes wrong, which should rarely happen. I think counting the events in the queue in
real time may be enough. Thoughts?

For the latest event, we can also record it only when necessary. For example, use a boolean
flag to control whether the next event is recorded, and record only one event at a time.
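
To make the two suggestions above concrete, here is a minimal sketch. It is not the patch and not the real AsyncDispatcher: the class OnDemandEventCounter, the enum DemoEventType and all method names are hypothetical, and the queue holds bare event types instead of Event objects to keep the example self-contained. Nothing is counted on the enqueue/dequeue path; the queue is scanned only when its size exceeds the threshold, and a boolean flag makes the consumer record exactly one event on request.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical event types, for illustration only.
enum DemoEventType { E1, E2 }

// Hypothetical stand-in for a dispatcher queue; not the real AsyncDispatcher.
class OnDemandEventCounter<T extends Enum<T>> {
  private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
  private final Class<T> typeClass;
  private final int threshold;                  // e.g. 5000, ideally read from configuration
  private volatile boolean recordNext = false;  // set only when diagnosis is wanted
  private volatile T recorded;                  // the single event captured on request

  OnDemandEventCounter(Class<T> typeClass, int threshold) {
    this.typeClass = typeClass;
    this.threshold = threshold;
  }

  // Producer side: no per-event bookkeeping.
  void enqueue(T eventType) {
    queue.add(eventType);
  }

  // Consumer side: records one event only if the flag was raised, then clears it.
  T dequeue() {
    T event = queue.poll();
    if (event != null && recordNext) {
      recorded = event;
      recordNext = false;
    }
    return event;
  }

  void requestRecordNext() {
    recordNext = true;
  }

  T lastRecorded() {
    return recorded;
  }

  // Scans the queue only when it is larger than the threshold; otherwise returns an empty map.
  // The iterator of LinkedBlockingQueue is weakly consistent, so the counts are approximate
  // while producers and the consumer keep running.
  Map<T, Long> countIfOverThreshold() {
    Map<T, Long> counts = new EnumMap<>(typeClass);
    if (queue.size() <= threshold) {
      return counts;
    }
    for (T type : queue) {
      counts.merge(type, 1L, Long::sum);
    }
    return counts;
  }
}
```

The trade-off in this sketch is that a scan costs time proportional to the queue length, but it is paid only in the rare oversized case, whereas a global map costs two updates for every event in the normal case.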

{quote}

Now I hard-code it to 5000.

{quote}

I suppose it should be configurable; you can set 5000 as the default.

{quote}

Do we need to print the event type sizes in order?

{quote}

I'm not sure what you mean. For example, printing "E1:3,E2:2,E1:1,..." when the event types
in the queue are "E1,E1,E1,E2,E2,E1,..."? If so, I think it's unnecessary.
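
To illustrate the difference between the two output styles discussed above, the following sketch formats the same queue contents both ways. The class EventSummaryFormats and its method names are hypothetical and event types are plain strings for brevity; it only demonstrates the formats and is not part of the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper comparing two ways of summarizing queued event types.
class EventSummaryFormats {

  // One total per event type, in order of first appearance.
  static String aggregated(List<String> types) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String type : types) {
      counts.merge(type, 1, Integer::sum);
    }
    List<String> parts = new ArrayList<>();
    for (Map.Entry<String, Integer> entry : counts.entrySet()) {
      parts.add(entry.getKey() + ":" + entry.getValue());
    }
    return String.join(",", parts);
  }

  // One entry per run of consecutive identical types, preserving queue order.
  static String inOrderRuns(List<String> types) {
    List<String> parts = new ArrayList<>();
    int start = 0;
    while (start < types.size()) {
      int end = start;
      while (end < types.size() && types.get(end).equals(types.get(start))) {
        end++;
      }
      parts.add(types.get(start) + ":" + (end - start));
      start = end;
    }
    return String.join(",", parts);
  }
}
```

For the queue "E1,E1,E1,E2,E2,E1", inOrderRuns yields "E1:3,E2:2,E1:1" while aggregated yields "E1:4,E2:2". The aggregated form stays bounded by the number of event types, whereas the in-order form can grow with the queue length.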

> Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8995
>                 URL: https://issues.apache.org/jira/browse/YARN-8995
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: metrics, nodemanager, resourcemanager
>    Affects Versions: 3.2.0
>            Reporter: zhuqi
>            Assignee: zhuqi
>            Priority: Major
>         Attachments: YARN-8995.001.patch
>
>
> In our growing cluster, there are unexpected situations that cause some event queues
> to block the performance of the cluster, such as the bug in https://issues.apache.org/jira/browse/YARN-5262.
> I think it's necessary to log the event types when the event queue size is too big, and add the
> information to the metrics. The queue size threshold should be a parameter that can be changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

