hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7843) Container Localizer is failing with NPE
Date Tue, 30 Jan 2018 15:34:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345232#comment-16345232 ]

Jason Lowe commented on YARN-7843:
----------------------------------

Is this against 3.1.0?  I am guessing this is the relevant code that is crashing, but it
would be good to verify:
{code:java}
    LocalizedResource rsrc = localrsrc.get(req);
    rsrc.setLocalPath(localPath);
{code}

If that is indeed the case then it looks like a resource was removed just as a path was being
computed for localization.  I think I see some races where this could occur during cache cleanup
or maybe even a case where a resource was thought to be localized and disappeared, but I don't
see how this could happen for every container as implied in the description.
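
To make the suspected failure mode concrete, here is a minimal, self-contained sketch of the get-then-dereference pattern and one possible guard. Everything in it is a hypothetical stand-in (LocalizationRaceSketch, TrackedResource, track, evict, and the assignPath methods are invented for illustration, and requests are plain strings); it is not the NodeManager code and not a proposed patch, only a way to show how a removal between tracking and lookup produces the NPE.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; not the NodeManager classes.
class LocalizationRaceSketch {

    /** Minimal stand-in for a tracked resource that can be given a local path. */
    static final class TrackedResource {
        private volatile String localPath;

        void setLocalPath(String path) {
            this.localPath = path;
        }

        String getLocalPath() {
            return localPath;
        }
    }

    // Assumption: the tracker keeps resources in a concurrent map keyed by request.
    private final Map<String, TrackedResource> cache = new ConcurrentHashMap<>();

    /** Registers a request so that a later lookup can find it. */
    void track(String request) {
        cache.put(request, new TrackedResource());
    }

    /** Simulates cache cleanup removing the entry. */
    void evict(String request) {
        cache.remove(request);
    }

    /**
     * Unguarded variant: same shape as the two lines quoted above.
     * Throws NullPointerException if the entry was removed before the lookup.
     */
    String assignPathUnguarded(String request, String path) {
        TrackedResource rsrc = cache.get(request);
        rsrc.setLocalPath(path);
        return rsrc.getLocalPath();
    }

    /**
     * Guarded variant: returns null when the entry has disappeared, so the
     * caller can decide whether to fail this one request or relocalize.
     */
    String assignPathGuarded(String request, String path) {
        TrackedResource rsrc = cache.get(request);
        if (rsrc == null) {
            return null;
        }
        rsrc.setLocalPath(path);
        return rsrc.getLocalPath();
    }
}
```

In this sketch, calling evict between track and assignPathUnguarded reproduces the NPE, while assignPathGuarded returns null instead. Note that a guard like this only explains an occasional failure; it does not explain the NPE being hit for every container.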

[~rohithsharma] have you checked the NM logs?  I'm curious if there are warning logs about
the resource missing and being relocalized or other indications that the resource was removed
from the cache just as another container was trying to use it.


> Container Localizer is failing with NPE
> ---------------------------------------
>
>                 Key: YARN-7843
>                 URL: https://issues.apache.org/jira/browse/YARN-7843
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Rohith Sharma K S
>            Priority: Blocker
>
> Container localizers are failing with an NPE; as a result, no containers are
> being launched!
> {noformat}
> Caused by: java.lang.NullPointerException: java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.getPathForLocalization(LocalResourcesTrackerImpl.java:503)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.getPathForLocalization(ResourceLocalizationService.java:1189)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.processHeartbeat(ResourceLocalizationService.java:1153)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:753)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:371)
>         at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:48)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

