hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2901) Add errors and warning stats to RM, NM web UI
Date Tue, 24 Mar 2015 01:13:54 GMT

    [ https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377040#comment-14377040 ]

Wangda Tan commented on YARN-2901:
----------------------------------

Hi [~vvasudev],

I spent some time looking at the Log4jMetricsAppender implementation (I will cover the other
modified components in the next round).

1) Log4jMetricsAppender:
1.1 Would it be better to place it in yarn-server-common?
1.2 If you agree with the above, how about putting it into package o.a.h.y.server.metrics (or utils)?
1.3 Rename it to Log4jWarnErrorMetricsAppender?
1.4 Comments about the implementation:
I think the cleanup implementation can be improved. Currently, the cutoff process for
messages/counts basically loops over all stored items, which could be inefficient (imagine
the number of stored messages exceeding the threshold). The existing logic in the patch could
also let a large number of messages accumulate, since tons of messages could be generated in
5 minutes, which is the interval at which the purge task runs.

If you can change the data structure to:
SortedMap<String, SortedMap<Long, Integer>> errors (and warnings), where the outer
map is sorted by value (the SortedMap with the smallest timestamp goes first) and the inner map
is sorted by key (smallest timestamp goes first), then the purge can happen whenever we add an
event. It will take at most log(N=500) time to do the purge, and no extra timer task is needed.

To make a SortedMap sort by value, one approach is described at http://stackoverflow.com/questions/109383/how-to-sort-a-mapkey-value-on-the-values-in-java
(first answer).

Here, value = SortedMap<Long, Integer>, so we can order the SortedMaps by the
smallest key in each one.

One corner case to consider: the same message can have lots of different timestamps,
so we need to purge the inner SortedMap too.

For better code readability, you can wrap the SortedMap in an inner class such as MessageInfo.
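
To illustrate the idea, here is a rough sketch of a purge-on-add store. It is not the code in
the patch: the class name WarnErrorStore, its methods, and the two limits are hypothetical,
and instead of a value-sorted SortedMap it uses a HashMap for lookup plus a TreeSet ordered by
each message's smallest timestamp, which is one possible way to get the same ordering.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch of purge-on-add; names and limits are assumptions. */
final class WarnErrorStore {

  /** Counts for one message, keyed by timestamp (smallest timestamp first). */
  static final class MessageInfo {
    final String message;
    final TreeMap<Long, Integer> countsByTime = new TreeMap<>();

    MessageInfo(String message) {
      this.message = message;
    }

    long oldest() {
      return countsByTime.firstKey();
    }
  }

  // Order by smallest timestamp; break ties on the message text.
  private static final Comparator<MessageInfo> OLDEST_FIRST = (a, b) -> {
    int c = Long.compare(a.oldest(), b.oldest());
    return c != 0 ? c : a.message.compareTo(b.message);
  };

  private final int maxMessages;
  private final int maxTimestampsPerMessage;
  private final Map<String, MessageInfo> byMessage = new HashMap<>();
  private final TreeSet<MessageInfo> byOldest = new TreeSet<>(OLDEST_FIRST);

  WarnErrorStore(int maxMessages, int maxTimestampsPerMessage) {
    if (maxMessages < 1 || maxTimestampsPerMessage < 1) {
      throw new IllegalArgumentException("limits must be >= 1");
    }
    this.maxMessages = maxMessages;
    this.maxTimestampsPerMessage = maxTimestampsPerMessage;
  }

  synchronized void add(String message, long timestampMillis) {
    MessageInfo info = byMessage.get(message);
    if (info == null) {
      info = new MessageInfo(message);
      byMessage.put(message, info);
    } else {
      // Remove before mutating, because the sort key may change below.
      byOldest.remove(info);
    }
    info.countsByTime.merge(timestampMillis, 1, Integer::sum);
    // Purge the inner map: drop the oldest timestamps of this message.
    while (info.countsByTime.size() > maxTimestampsPerMessage) {
      info.countsByTime.pollFirstEntry();
    }
    byOldest.add(info);
    // Purge the outer structure: evict messages with the smallest timestamp.
    while (byMessage.size() > maxMessages) {
      MessageInfo evicted = byOldest.pollFirst();
      byMessage.remove(evicted.message);
    }
  }

  /** Total occurrences still stored for a message, or 0 if it was purged. */
  synchronized int count(String message) {
    MessageInfo info = byMessage.get(message);
    if (info == null) {
      return 0;
    }
    int total = 0;
    for (int c : info.countsByTime.values()) {
      total += c;
    }
    return total;
  }

  synchronized int size() {
    return byMessage.size();
  }
}
```

Each add does a constant number of TreeMap/TreeSet operations plus one eviction per entry over
the limit, so no separate timer task is needed. Note that a message is evicted based on its
smallest stored timestamp, even if it also has more recent occurrences.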

> Add errors and warning stats to RM, NM web UI
> ---------------------------------------------
>
>                 Key: YARN-2901
>                 URL: https://issues.apache.org/jira/browse/YARN-2901
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: Exception collapsed.png, Exception expanded.jpg, Screen Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, apache-yarn-2901.1.patch
>
>
> It would be really useful to have statistics on the number of errors and warnings in the RM and NM web UI. I'm thinking about -
> 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
> 2. The top 'n'(20?) most common exceptions in the past 5 min/1 hour/12 hours/day
> By errors and warnings I'm referring to the log level.
> I suspect we can probably achieve this by writing a custom appender? (I'm open to suggestions on alternate mechanisms for implementing this).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
