[ https://issues.apache.org/jira/browse/YARN-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318461#comment-14318461
]
Junping Du commented on YARN-2079:
----------------------------------
Thanks [~jlowe] for updating the patch! The patch looks fine overall. However, a small
bug needs to be fixed here:
In NonAggregatingLogHandler.java,
{code}
+ private void recover() throws IOException {
+ if (stateStore.canRecover()) {
...
+ long deleteDelayMsec = now - proto.getDeletionTime();
...
+ sched.schedule(logDeleter, deleteDelayMsec, TimeUnit.MILLISECONDS);
+ }
+ }
+ }
{code}
I think proto.getDeletionTime() is the original deletion time, so the number of milliseconds left
before deletion should be proto.getDeletionTime() - now. If this value is negative, we should use 0 instead.
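A minimal sketch of the delay computation I have in mind. The class and method names (RecoveryDelay, remainingDelayMsec) are hypothetical and not part of the patch; it only assumes the persisted deletion time and "now" are both in milliseconds since the epoch:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not code from the YARN-2079 patch.
final class RecoveryDelay {
  private RecoveryDelay() {}

  /**
   * Returns the milliseconds remaining until the persisted deletion time.
   * If the deletion time has already passed, returns 0 so the deletion
   * can be scheduled to run immediately.
   */
  static long remainingDelayMsec(long deletionTimeMsec, long nowMsec) {
    return Math.max(0L, deletionTimeMsec - nowMsec);
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    // Deletion time 30 seconds in the future: 30000 ms remain.
    long future = now + TimeUnit.SECONDS.toMillis(30);
    System.out.println(remainingDelayMsec(future, now)); // prints 30000
    // Deletion time already 5 seconds in the past: clamp to 0.
    System.out.println(remainingDelayMsec(now - 5000L, now)); // prints 0
  }
}
```

In recover(), the result would then be passed to sched.schedule() in place of now - proto.getDeletionTime().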
In addition, do we want to handle RejectedExecutionException from sched.schedule(), following the
same practice as in other places, e.g. handle(LogHandlerEvent event)?
{code}
catch (RejectedExecutionException e) {
// Handling this event in local thread before starting threads
// or after calling sched.shutdownNow().
logDeleter.run();
}
{code}
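To illustrate how the recovery path could apply the same fallback, here is a self-contained sketch. RecoveryScheduling and scheduleOrRunInline are hypothetical names, and a plain Runnable stands in for the real log deleter; this is not the actual NonAggregatingLogHandler code:

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not code from the YARN-2079 patch.
final class RecoveryScheduling {
  private RecoveryScheduling() {}

  /**
   * Tries to schedule the deleter after the given delay. If the executor
   * rejects the task (for example after shutdownNow()), runs the deleter
   * in the calling thread instead.
   *
   * @return true if the task was scheduled, false if it ran inline
   */
  static boolean scheduleOrRunInline(ScheduledThreadPoolExecutor sched,
      Runnable logDeleter, long delayMsec) {
    try {
      sched.schedule(logDeleter, delayMsec, TimeUnit.MILLISECONDS);
      return true;
    } catch (RejectedExecutionException e) {
      // Executor is not accepting tasks; handle the deletion locally.
      logDeleter.run();
      return false;
    }
  }

  public static void main(String[] args) {
    ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
    sched.shutdownNow(); // force the rejection path for the demo
    boolean scheduled = scheduleOrRunInline(sched,
        () -> System.out.println("deleting logs inline"), 0L);
    System.out.println("scheduled=" + scheduled);
  }
}
```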
Everything else looks fine to me.
> Recover NonAggregatingLogHandler state upon nodemanager restart
> ---------------------------------------------------------------
>
> Key: YARN-2079
> URL: https://issues.apache.org/jira/browse/YARN-2079
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.4.0
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: YARN-2079.002.patch, YARN-2079.patch
>
>
> The state of NonAggregatingLogHandler needs to be persisted so logs are properly deleted across a nodemanager restart.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)