hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4354) Public resource localization fails with NPE
Date Sun, 15 Nov 2015 12:36:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005866#comment-15005866 ]

Junping Du commented on YARN-4354:

+1. Patch LGTM. Will commit it shortly.
bq. Looks like this can cause nodemanagers to crash as well.
To make the NM more robust, I think we should tolerate this kind of failure/exception in LocalResourcesTracker
rather than letting the NM's dispatcher crash and exit. Maybe we can give LocalResourcesTracker
a separate AsyncDispatcher with "DISPATCHER_EXIT_ON_ERROR_KEY" set to false, like what
we do in the RM for SchedulerEventDispatcher?
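
A rough sketch of the idea, for illustration only. It does not use Hadoop's AsyncDispatcher or any YARN class; TolerantDispatcherSketch and its exitOnError flag are hypothetical stand-ins for a dispatcher whose exit-on-error setting is switched off, so that a handler exception (such as the NPE here) is recorded and the remaining events are still processed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical stand-in for a dispatcher; NOT Hadoop's AsyncDispatcher. */
public class TolerantDispatcherSketch {
    private final Queue<String> events = new ArrayDeque<>();
    private final Consumer<String> handler;
    // Assumption: models the effect of an exit-on-error setting.
    private final boolean exitOnError;
    private final List<String> failedEvents = new ArrayList<>();
    private boolean stopped = false;

    public TolerantDispatcherSketch(Consumer<String> handler, boolean exitOnError) {
        this.handler = handler;
        this.exitOnError = exitOnError;
    }

    public void enqueue(String event) {
        events.add(event);
    }

    /** Processes queued events in order until the queue is empty or the dispatcher stops. */
    public void drain() {
        while (!stopped && !events.isEmpty()) {
            String event = events.poll();
            try {
                handler.accept(event);
            } catch (RuntimeException e) {
                failedEvents.add(event);
                if (exitOnError) {
                    // Simulates the whole process going down on a handler failure.
                    stopped = true;
                }
                // Otherwise: tolerate the failure and keep dispatching.
            }
        }
    }

    public boolean isStopped() {
        return stopped;
    }

    public List<String> getFailedEvents() {
        return failedEvents;
    }

    public int pendingCount() {
        return events.size();
    }

    public static void main(String[] args) {
        Consumer<String> handler = event -> {
            if (event.startsWith("bad")) {
                throw new NullPointerException("simulated NPE for " + event);
            }
        };
        TolerantDispatcherSketch tolerant = new TolerantDispatcherSketch(handler, false);
        tolerant.enqueue("localize-1");
        tolerant.enqueue("bad-resource");
        tolerant.enqueue("localize-2");
        tolerant.drain();
        System.out.println("stopped=" + tolerant.isStopped()
            + " failed=" + tolerant.getFailedEvents()
            + " pending=" + tolerant.pendingCount());
    }
}
```

With exitOnError set to true, the same sequence stops at the failing event and leaves the later one unprocessed, which is the crash-and-exit behavior the comment wants to avoid for the tracker.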

> Public resource localization fails with NPE
> -------------------------------------------
>                 Key: YARN-4354
>                 URL: https://issues.apache.org/jira/browse/YARN-4354
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.2
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>         Attachments: YARN-4354-unittest.patch, YARN-4354.001.patch, YARN-4354.002.patch
> I saw public localization on nodemanagers get stuck because it was constantly rejecting
> requests to the thread pool executor.

