hadoop-yarn-issues mailing list archives

From "Lin Wen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4452) NPE when submit Unmanaged application
Date Mon, 14 Dec 2015 14:20:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056030#comment-15056030 ]

Lin Wen commented on YARN-4452:
-------------------------------

I can see the following in YARN's log file:
2015-12-10 02:52:19,025 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
Storing attempt: AppId: application_1449744734026_0001 AttemptId: appattempt_1449744734026_0001_000001
MasterContainer: null
...
2015-12-10 02:52:19,946 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager:
Error in handling event type REGISTERED for applicationAttempt application_1449744734026_0001
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptRegistered(SystemMetricsPublisher.java:145)

I guess that, because no container is allocated for an "unmanaged" application master, MasterContainer
is null. But when YARN registers this application attempt with SystemMetricsPublisher, it requires
a container and its id. That is why the NullPointerException happens.

  private void storeAttempt() {
    // store attempt data in a non-blocking manner to prevent dispatcher
    // thread starvation and wait for state to be saved
    LOG.info("Storing attempt: AppId: " + 
              getAppAttemptId().getApplicationId() 
              + " AttemptId: " + 
              getAppAttemptId()
              + " MasterContainer: " + masterContainer);
    rmContext.getStateStore().storeNewApplicationAttempt(this);
  }

  public void appAttemptRegistered(RMAppAttempt appAttempt,
      long registeredTime) {
    if (publishSystemMetrics) {
      dispatcher.getEventHandler().handle(
          new AppAttemptRegisteredEvent(
              appAttempt.getAppAttemptId(),
              appAttempt.getHost(),
              appAttempt.getRpcPort(),
              appAttempt.getTrackingUrl(),
              appAttempt.getOriginalTrackingUrl(),
              appAttempt.getMasterContainer().getId(),
              registeredTime));
    }
  }

In short, if an unmanaged AM tries to register with YARN when the timeline server is configured
and "yarn.resourcemanager.system-metrics-publisher.enabled" is enabled, a java.lang.NullPointerException
occurs in YARN.
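
One possible guard is to look up the master container first and pass a null id when none was
allocated. The sketch below only illustrates that idea and is not necessarily the fix that
gets committed. Container, Attempt, masterContainerIdOrNull and the id string are hypothetical
stand-ins, not the real Hadoop classes, and whether the event consumer tolerates a null
container id is not checked here.

```java
// Hypothetical stand-ins for the YARN types; none of these are the real Hadoop classes.
final class MasterContainerGuardSketch {

    /** Stand-in for a container that carries an id. */
    static final class Container {
        private final String id;
        Container(String id) { this.id = id; }
        String getId() { return id; }
    }

    /** Stand-in for an application attempt; masterContainer is null for an unmanaged AM. */
    static final class Attempt {
        private final Container masterContainer;
        Attempt(Container masterContainer) { this.masterContainer = masterContainer; }
        Container getMasterContainer() { return masterContainer; }
    }

    /** Returns the master container id, or null when no container was allocated. */
    static String masterContainerIdOrNull(Attempt attempt) {
        Container master = attempt.getMasterContainer();
        return master == null ? null : master.getId();
    }

    public static void main(String[] args) {
        // "container-1" is a made-up id used only for illustration.
        Attempt managed = new Attempt(new Container("container-1"));
        Attempt unmanaged = new Attempt(null);
        System.out.println(masterContainerIdOrNull(managed));
        System.out.println(masterContainerIdOrNull(unmanaged));
    }
}
```

In appAttemptRegistered, the equivalent change would be to compute the container id this way
before building the AppAttemptRegisteredEvent, instead of calling getMasterContainer().getId()
directly.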

> NPE when submit Unmanaged application
> -------------------------------------
>
>                 Key: YARN-4452
>                 URL: https://issues.apache.org/jira/browse/YARN-4452
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>            Priority: Critical
>
> As reported in the forum by Wen Lin (wlin@pivotal.io)
> {quote}
> [gpadmin@master simple-yarn-app]$ hadoop jar
> ~/hadoop/singlecluster/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.6.0.3.0.0.0-120.jar
> Client --classpath  ./target/simple-yarn-app-1.1.0.jar -cmd "java
> com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 2"
> {quote}
> The error is:
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type REGISTERED for applicationAttempt application_1450079798629_0001
> java.lang.NullPointerException
>     at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptRegistered(SystemMetricsPublisher.java:143)
>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1365)
>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1341)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
