hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters
Date Thu, 20 Mar 2014 02:05:43 GMT

    [ https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941310#comment-13941310 ]

Vinod Kumar Vavilapalli commented on YARN-1849:

Haven't looked at the patch, but in general there is a constant tussle between keeping things
up and failing fast so as to be able to fix bugs.

I would in general avoid null checks unless I am sure - failing the RM/NM at least uncovers
the bug instead of limping with it and then breaking somewhere else at which point it becomes
hard to root-cause. If possible, let's fix what is actually broken here instead of putting
in a lot of null checks (if that is what the above comments are talking about). Sure, we may
run into one more issue that we haven't foreseen, but we can at least take comfort in knowing that
we are addressing the right corner cases.
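
To make the trade-off concrete, here is a minimal Java sketch contrasting a blanket null check with a fail-fast lookup. It is only an illustration, not the code from the YARN-1849 patch: the class FailFastSketch, the AppState type, the APPS map, and both lookup methods are hypothetical names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class FailFastSketch {
    // Hypothetical stand-in for per-application state; not a real YARN class.
    static final class AppState {
        final String user;

        AppState(String user) {
            this.user = user;
        }
    }

    // Hypothetical registry of known applications, keyed by application id.
    static final Map<String, AppState> APPS = new HashMap<>();

    // Defensive style: a missing entry is swallowed and null is passed on,
    // so any failure shows up later, far from the real cause.
    static String lookupUserLenient(String appId) {
        AppState state = APPS.get(appId);
        return state == null ? null : state.user;
    }

    // Fail-fast style: a missing entry is treated as a bug and reported
    // at the point where the broken assumption is first observed.
    static String lookupUserStrict(String appId) {
        AppState state = Objects.requireNonNull(
                APPS.get(appId), "no state recorded for application " + appId);
        return state.user;
    }

    public static void main(String[] args) {
        APPS.put("app_1", new AppState("alice"));

        System.out.println(lookupUserStrict("app_1"));
        System.out.println(lookupUserLenient("app_2"));

        try {
            lookupUserStrict("app_2");
        } catch (NullPointerException e) {
            System.out.println("failed fast: " + e.getMessage());
        }
    }
}
```

The lenient variant keeps the caller running but hands it a null that can break somewhere else later. The strict variant still throws a NullPointerException, but with a message that names the missing application, which is what makes the underlying bug easier to root-cause.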

> NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters
> ----------------------------------------------------------------------------
>                 Key: YARN-1849
>                 URL: https://issues.apache.org/jira/browse/YARN-1849
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>            Priority: Blocker
>         Attachments: yarn-1849-1.patch, yarn-1849-2.patch, yarn-1849-2.patch, yarn-1849-3.patch
> While running an UnmanagedAM on a secure cluster, ran into an NPE on failover/restart.
> This is similar to YARN-1821.
