hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Daniel Templeton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6125) The application attempt's diagnostic message should have a maximum size
Date Wed, 01 Feb 2017 01:06:51 GMT

    [ https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847836#comment-15847836

Daniel Templeton commented on YARN-6125:

Thanks for the patch.  Some comments:

* In the property, let's use "limit" instead of "capacity."  It sounds firmer. :)
might be better.  Then you can default to 64 instead of 65536.
* I can't comment on the POM change.  Anyone else want to comment?
* {{Lists.newLinkedList()}} is a Java 6 thing.  Just use the {{LinkedList}} constructor directly.
* I don't think we need the default constructor.
* You should do parameter validation in {{RMAppAttemptImpl}} before passing the capacity to
the constructor.  You can offer a more informative error message that way.
* {{ensureNull()}} should probably be {{ensureNotNull()}}.
* I don't see any good reason to have {{inputLength}} be final.
* Your checks at the beginning of {{cutAtLeast()}}, {{checkAndCut()}}, and append will bring
down the RM if they're violated.  That's bad.  You should probably log and ignore instead,
probably at the level of the RM, instead of in the appender.  Except for {{checkAndCut()}},
where you should either trim down the new message to fit or allow the oversized message to
be appended.  Probably the former.
* Your last append should probably just call the first append, and the first one should probably
call the second one.
* Just leave out the javadoc on the overridden methods instead of stubbing it out.
* The description in {{yarn-default.xml}}, the description should maybe be "Defines the maximum
capacity of the diagnostic message for each application attempt, in bytes.  When using ZooKeeper
to store application state behavior, it's important to limit the size of the diagnostic messages
to prevent YARN from overwhelming ZooKeeper.  In cases where yarn.resourcemanager.state-store.max-completed-applications
is set to a large number, it may be desirable to reduce the value of this property to limit
the total data stored."
* No need for the empty {{setup()}} in the test.
* {{initWithPositiveCapacitySuccess()}} should assert something. :)

> The application attempt's diagnostic message should have a maximum size
> -----------------------------------------------------------------------
>                 Key: YARN-6125
>                 URL: https://issues.apache.org/jira/browse/YARN-6125
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Daniel Templeton
>            Assignee: Andras Piros
>            Priority: Critical
>             Fix For: 3.0.0-alpha3
>         Attachments: YARN-6125.000.patch, YARN-6125.001.patch, YARN-6125.002.patch
> We've found through experience that the diagnostic message can grow unbounded.  I've
seen attempts that have diagnostic messages over 1MB.  Since the message is stored in the
state store, it's a bad idea to allow the message to grow unbounded.  Instead, there should
be a property that sets a maximum size on the message.
> I suspect that some of the ZK state store issues we've seen in the past were due to the
size of the diagnostic messages and not to the size of the classpath, as is the current prevailing
> An open question is how best to prune the message once it grows too large.  Should we
> # truncate the tail,
> # truncate the head,
> # truncate the middle,
> # add another property to make the behavior selectable, or
> # none of the above?

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

View raw message