hadoop-yarn-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Wen Lin <w...@pivotal.io>
Subject Re: Meet A java null pointer when use unmanaged AM and yarn.resourcemanager.system-metrics-publisher.enabled is enable
Date Mon, 14 Dec 2015 08:10:06 GMT
Hi, Yarn Developers,

I have found another reproduce for this bug.
I am using hortonworks' simple-yarn-app(
https://github.com/hortonworks/simple-yarn-app), and start it in "unmanaged
AM" mode.
On Hadoop Yarn, timeline server is started/enabled and
"yarn.resourcemanager.system-metrics-publisher.enabled" is enable.
On the client side, the output is:

[gpadmin@master simple-yarn-app]$ hadoop jar
~/hadoop/singlecluster/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.6.0.3.0.0.0-120.jar
Client --classpath  ./target/simple-yarn-app-1.1.0.jar -cmd "java
com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 2"

15/12/14 07:57:04 INFO unmanagedamlauncher.UnmanagedAMLauncher:
Initializing Client

15/12/14 07:57:06 INFO impl.TimelineClientImpl: Timeline service address:
http://master:8188/ws/v1/timeline/

15/12/14 07:57:06 INFO unmanagedamlauncher.UnmanagedAMLauncher: Starting
Client

15/12/14 07:57:06 INFO unmanagedamlauncher.UnmanagedAMLauncher: Setting up
application submission context for ASM

15/12/14 07:57:06 INFO unmanagedamlauncher.UnmanagedAMLauncher: Setting
unmanaged AM

15/12/14 07:57:06 INFO unmanagedamlauncher.UnmanagedAMLauncher: Submitting
application to ASM

15/12/14 07:57:07 INFO impl.YarnClientImpl: Submitted application
application_1450079798629_0001

15/12/14 07:57:08 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1,
appStartTime=1450079827025, yarnAppState=ACCEPTED,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:08 INFO unmanagedamlauncher.UnmanagedAMLauncher: Launching
AM with application attempt id appattempt_1450079798629_0001_000001

15/12/14 07:57:09 INFO impl.ContainerManagementProtocolProxy:
yarn.client.max-cached-nodemanagers-proxies : 0

registerApplicationMaster 0

registerApplicationMaster 1

Making res-req 0

Making res-req 1

15/12/14 07:57:10 INFO impl.AMRMClientImpl: Received new token for :
node2:46502

Launching container container_1450079798629_0001_01_000001

15/12/14 07:57:10 INFO impl.ContainerManagementProtocolProxy: Opening proxy
: node2:46502

15/12/14 07:57:10 INFO impl.AMRMClientImpl: Received new token for :
node3:28286

Launching container container_1450079798629_0001_01_000002

15/12/14 07:57:10 INFO impl.ContainerManagementProtocolProxy: Opening proxy
: node3:28286

Completed container container_1450079798629_0001_01_000001

Completed container container_1450079798629_0001_01_000002

15/12/14 07:57:11 INFO unmanagedamlauncher.UnmanagedAMLauncher: AM process
exited with value: 0

15/12/14 07:57:12 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:13 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:14 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:15 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:16 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:17 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:18 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:19 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:20 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:21 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:22 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got
application report from ASM for, appId=1,
appAttemptId=appattempt_1450079798629_0001_000001, clientToAMToken=null,
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0,
appStartTime=1450079827025, yarnAppState=RUNNING,
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=gpadmin

15/12/14 07:57:22 WARN unmanagedamlauncher.UnmanagedAMLauncher: Waited 10
seconds after process completed for AppReport to reach desired final state.
Not waiting anymore.CurrentState = RUNNING, ExpectedStates =
FINISHED,FAILED,KILLED

15/12/14 07:57:23 FATAL unmanagedamlauncher.UnmanagedAMLauncher: Error
running Client

java.lang.RuntimeException: Failed to receive final expected state in
ApplicationReport, CurrentState=RUNNING,
ExpectedStates=FINISHED,FAILED,KILLED

at
org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.monitorApplication(UnmanagedAMLauncher.java:487)

at
org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.run(UnmanagedAMLauncher.java:355)

at
org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.main(UnmanagedAMLauncher.java:110)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.util.RunJar.run(RunJar.java:221)

at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

In Yarn's RM log file, there is also a javaNullException, same with the
situation I use libyarn as a "unmanaged AM" mentioned in my first email.

2015-12-14 07:57:06,960 INFO
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated
new applicationId: 1

640 2015-12-14 07:57:07,025 WARN
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The specific
max attempts: 0 for application: 1 is invalid, because it is out of the
range [1, 2]. Use the global max attempts instead.

641 2015-12-14 07:57:07,029 INFO
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application
with id 1 submitted by user gpadmin

642 2015-12-14 07:57:07,029 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing
application with id application_1450079798629_0001

643 2015-12-14 07:57:07,029 INFO
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=gpadmin
IP=192.168.2.201    OPERATION=Submit Application Request
TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_145007979862
9_0001

644 2015-12-14 07:57:07,031 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1450079798629_0001 State change from NEW to NEW_SAVING

645 2015-12-14 07:57:07,043 INFO
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
Storing info for app: application_1450079798629_0001

646 2015-12-14 07:57:07,044 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1450079798629_0001 State change from NEW_SAVING to SUBMITTED

647 2015-12-14 07:57:07,046 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
Application added - appId: application_1450079798629_0001 user: gpadmin
leaf-queue of parent: root #applications: 1

648 2015-12-14 07:57:07,047 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Accepted application application_1450079798629_0001 from user: gpadmin, in
queue: default

649 2015-12-14 07:57:07,105 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1450079798629_0001 State change from SUBMITTED to ACCEPTED

650 2015-12-14 07:57:07,146 INFO
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
Registering app attempt : appattempt_1450079798629_0001_000001

651 2015-12-14 07:57:07,147 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1450079798629_0001_000001 State change from NEW to SUBMITTED

652 2015-12-14 07:57:07,162 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
Application application_1450079798629_0001 from user: gpadmin activated in
queue: default

653 2015-12-14 07:57:07,162 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
Application added - appId: application_1450079798629_0001 user:
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.
LeafQueue$User@68d50d2f, leaf-queue: default #user-pending-applications: 0
#user-active-applications: 1 #queue-pending-applications: 0
#queue-active-applications: 1

654 2015-12-14 07:57:07,162 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Added Application Attempt appattempt_1450079798629_0001_000001 to scheduler
from user gpadmin in queue default

655 2015-12-14 07:57:07,163 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
Storing attempt: AppId: application_1450079798629_0001 AttemptId:
appattempt_1450079798629_0001_000001 MasterContainer: null

656 2015-12-14 07:57:07,165 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1450079798629_0001_000001 State change from SUBMITTED to
LAUNCHED_UNMANAGED_SAVING

657 2015-12-14 07:57:07,178 INFO
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
Create AMRMToken for ApplicationAttempt:
appattempt_1450079798629_0001_000001

658 2015-12-14 07:57:07,183 INFO
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
Creating password for appattempt_1450079798629_0001_000001

659 2015-12-14 07:57:07,188 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1450079798629_0001_000001 State change from
LAUNCHED_UNMANAGED_SAVING to LAUNCHED

660 2015-12-14 07:57:10,078 INFO
SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for
appattempt_1450079798629_0001_000001 (auth:SIMPLE)

661 2015-12-14 07:57:10,108 INFO
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AM
registration appattempt_1450079798629_0001_000001

662 2015-12-14 07:57:10,109 INFO
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=gpadmin
IP=192.168.2.201    OPERATION=Register App Master
TARGET=ApplicationMasterService RESULT=SUCCESS
APPID=application_145007979862    9_0001
APPATTEMPTID=appattempt_1450079798629_0001_000001

663 2015-12-14 07:57:10,111 ERROR
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in
handling event type REGISTERED for applicationAttempt
application_1450079798629_0001

664 java.lang.NullPointerException

665     at
org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptRegistered(SystemMetricsPublisher.java:143)

666     at
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1365)

667     at
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1341)

668     at
org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)

669     at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)

670     at
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)

671     at
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)

672     at
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)

673     at
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)

674     at
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)

675     at
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)

676     at
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)

677     at
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)

678     at java.lang.Thread.run(Thread.java:745)

679 2015-12-14 07:57:10,113 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1450079798629_0001 State change from ACCEPTED to RUNNING

680 2015-12-14 07:57:10,196 ERROR
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
Can't handle this event at current state

681 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid
event: STATUS_UPDATE at LAUNCHED

682     at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)

683     at
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)

684     at
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)

685     at
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)

686     at
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)

687     at
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)

688     at
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)

689     at
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)

690     at
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)

691     at java.lang.Thread.run(Thread.java:745)

692 2015-12-14 07:57:10,331 ERROR
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
Can't handle this event at current state

693 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid
event: STATUS_UPDATE at LAUNCHED

694     at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)

695     at
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)

696     at
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)

697     at
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)




So I can conclude that,  if a unmanaged AM tries to register in Yarn, when
timeline server is configured and
"yarn.resourcemanager.system-metrics-publisher.enabled"
is enable, a java NullPointerException occurs in Yarn.
Would you please fix it?
Thanks!

On Thu, Dec 10, 2015 at 10:45 PM, Wen Lin <wlin@pivotal.io> wrote:

> Hi, Yarn Developers,
>
> I am writing to report a strange case I meet here.
> I was using libyarn(
> https://github.com/apache/incubator-hawq/tree/master/depends/libyarn), a
> kind of C++ client library to communicate with Hadoop Yarn. It acts as a
> "Unmanaged Application Master", doing things such as register application,
> request to allocate/release containers from Yarn, activate containers,
> etc(sample app is in /libyarn/sample/c_client_main.c).
>
> What is strange to me is, when configure timeline server and
> yarn.resourcemanager.system-metrics-publisher.enabled is enable in
> yarn-site.xml. Something strange happens.
>
> If the property "yarn.resourcemanager.system-metrics-publisher.enabled" is
> not enable and no timeline server is configured, things goes well.
> Once Yarn RM gets a heartbeat from my application, the progress becomes
> 50%, 50% is hard coded in allocate request.
>
> [gpadmin@master ~]$ yarn application -list
>
> 15/12/09 08:12:54 INFO impl.TimelineClientImpl: Timeline service address:
> http://master:8188/ws/v1/timeline/
>
> Total number of applications (application-types: [] and states:
> [SUBMITTED, ACCEPTED, RUNNING]):1
>
>                 Application-Id     Application-Name     Application-Type
>     User      Queue              State        Final-State        Progress
>                     Tracking-URL
>
> application_1449630430655_0007              libyarn                 YARN
> gpadmin    default            RUNNING          UNDEFINED             50%
>                               url
>
> However if timeline server is configured, and
>  "yarn.resourcemanager.system-metrics-publisher.enabled" is enable,
> Even my application can allocate/activate/release containers, the progress
> of my application is ALWAYS 0%!
>
> I've check the log of Yarn resource manager, I find
> a NullPointerException in log.
> I've google this exception with such callstack. there is a question, which
> is similar to mine.
>
> http://zh.hortonworks.com/community/forums/topic/error-in-handling-event-type-registered-for-applicationattempt/
>
> I can see below information in log in log file:
> 2015-12-10 02:52:19,025 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> Storing attempt: AppId: application_1449744734026_0001 AttemptId:
> appattempt_1449744734026_0001_000001 MasterContainer: null
> ...
> 2015-12-10 02:52:19,946 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in
> handling event type REGISTERED for applicationAttempt
> application_1449744734026_0001
> java.lang.NullPointerException
>         at
> org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptRegistered(SystemMetricsPublisher.java:145)
>
> Below is related Yarn codes:
>   private void storeAttempt() {
>     // store attempt data in a non-blocking manner to prevent dispatcher
>     // thread starvation and wait for state to be saved
>     LOG.info("Storing attempt: AppId: " +
>               getAppAttemptId().getApplicationId()
>               + " AttemptId: " +
>               getAppAttemptId()
>               + " MasterContainer: " + masterContainer);
>     rmContext.getStateStore().storeNewApplicationAttempt(this);
>   }
>
>   public void appAttemptRegistered(RMAppAttempt appAttempt,
>       long registeredTime) {
>     if (publishSystemMetrics) {
>       dispatcher.getEventHandler().handle(
>           new AppAttemptRegisteredEvent(
>               appAttempt.getAppAttemptId(),
>               appAttempt.getHost(),
>               appAttempt.getRpcPort(),
>               appAttempt.getTrackingUrl(),
>               appAttempt.getOriginalTrackingUrl(),
>               appAttempt.getMasterContainer().getId(),
>               registeredTime));
>     }
>   }
> I guess since libyarn is a "unmanaged" application master, there is no
> container allocated for it, so MasterContainer is null. But when Yarn
> register this application into SystemMetricsPublisher, it requires a
> container and its id. That's why this null exception happens.
> In a word, if a unmanaged AM tries to register in Yarn, when timeline
> server is configured and
>  "yarn.resourcemanager.system-metrics-publisher.enabled" is enable, a java
> NullPointerException occurs in Yarn.
> I am not sure if it's my app's fault or Yarn's bug. Would you please help
> me take a look at it?
>
> Thanks,
> Wen
>
> 2015-12-10 02:52:18,842 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated
> new applicationId: 1
> 2015-12-10 02:52:18,873 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application
> with id 1 submitted by user gpadmin
> 2015-12-10 02:52:18,874 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing
> application with id application_1449744734026_0001
> 2015-12-10 02:52:18,874 INFO
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=gpadmin
>  IP=10.32.34.218 OPERATION=Submit Application Request
>  TARGET=ClientRMService  RESULT=SUCCESS
>  APPID=application_1449744734026_0001
> 2015-12-10 02:52:18,883 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1449744734026_0001 State change from NEW to NEW_SAVING
> 2015-12-10 02:52:18,884 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
> Storing info for app: application_1449744734026_0001
> 2015-12-10 02:52:18,908 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1449744734026_0001 State change from NEW_SAVING to SUBMITTED
> 2015-12-10 02:52:18,910 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
> Application added - appId: application_1449744734026_0001 user: gpadmin
> leaf-queue of parent: root #applications: 1
> 2015-12-10 02:52:18,911 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
> Accepted application application_1449744734026_0001 from user: gpadmin, in
> queue: default
> 2015-12-10 02:52:18,969 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1449744734026_0001 State change from SUBMITTED to ACCEPTED
> 2015-12-10 02:52:19,004 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
> Registering app attempt : appattempt_1449744734026_0001_000001
> 2015-12-10 02:52:19,006 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1449744734026_0001_000001 State change from NEW to SUBMITTED
> 2015-12-10 02:52:19,023 WARN
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
> maximum-am-resource-percent is insufficient to start a single application
> in queue, it is likely set too low. skipping enforcement to allow at least
> one application to start
> 2015-12-10 02:52:19,024 WARN
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
> maximum-am-resource-percent is insufficient to start a single application
> in queue for user, it is likely set too low. skipping enforcement to allow
> at least one application to start
> 2015-12-10 02:52:19,024 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
> Application application_1449744734026_0001 from user: gpadmin activated in
> queue: default
> 2015-12-10 02:52:19,024 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
> Application added - appId: application_1449744734026_0001 user:
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@6a7edb15,
> leaf-queue: default #user-pending-applications: 0
> #user-active-applications: 1 #queue-pending-applications: 0
> #queue-active-applications: 1
> 2015-12-10 02:52:19,024 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
> Added Application Attempt appattempt_1449744734026_0001_000001 to scheduler
> from user gpadmin in queue default
> 2015-12-10 02:52:19,025 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> Storing attempt: AppId: application_1449744734026_0001 AttemptId:
> appattempt_1449744734026_0001_000001 MasterContainer: null
> 2015-12-10 02:52:19,039 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1449744734026_0001_000001 State change from SUBMITTED to
> LAUNCHED_UNMANAGED_SAVING
> 2015-12-10 02:52:19,039 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
> Create AMRMToken for ApplicationAttempt:
> appattempt_1449744734026_0001_000001
> 2015-12-10 02:52:19,042 INFO
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager:
> Creating password for appattempt_1449744734026_0001_000001
> 2015-12-10 02:52:19,046 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1449744734026_0001_000001 State change from
> LAUNCHED_UNMANAGED_SAVING to LAUNCHED
> 2015-12-10 02:52:19,936 INFO SecurityLogger.org.apache.hadoop.ipc.Server:
> Auth successful for appattempt_1449744734026_0001_000001 (auth:SIMPLE)
> 2015-12-10 02:52:19,944 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AM
> registration appattempt_1449744734026_0001_000001
> 2015-12-10 02:52:19,945 INFO
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=gpadmin
>  IP=10.32.34.218 OPERATION=Register App Master
> TARGET=ApplicationMasterService RESULT=SUCCESS
>  APPID=application_1449744734026_0001
>  APPATTEMPTID=appattempt_1449744734026_0001_000001
> 2015-12-10 02:52:19,946 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in
> handling event type REGISTERED for applicationAttempt
> application_1449744734026_0001
> java.lang.NullPointerException
>         at
> org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptRegistered(SystemMetricsPublisher.java:145)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1405)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1380)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:786)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:850)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:831)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-12-10 02:52:19,947 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1449744734026_0001 State change from ACCEPTED to RUNNING
> 2015-12-10 02:52:19,963 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid
> event: STATUS_UPDATE at LAUNCHED
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:786)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:850)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:831)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-12-10 02:52:38,864 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Node
> not found resyncing test2:9983
> 2015-12-10 02:52:39,884 INFO org.apache.hadoop.yarn.util.RackResolver:
> Resolved test2 to /default-rack
> 2015-12-10 02:52:39,886 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService:
> NodeManager from node test2(cmPort: 9983 httpPort: 8042) registered with
> capability: <memory:12288, vCores:6>, assigned nodeId test2:9983
> 2015-12-10 02:52:39,886 WARN
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Cannot get
> RMApp by appId=application_1449655982302_0004, just added it to
> finishedApplications list for cleanup
> 2015-12-10 02:52:39,886 WARN
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Cannot get
> RMApp by appId=application_1449743958580_0001, just added it to
> finishedApplications list for cleanup
> 2015-12-10 02:52:39,889 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: test2:9983
> Node Transitioned from NEW to RUNNING
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message