hadoop-yarn-issues mailing list archives

From "Jiandan Yang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-9237) RM prints a lot of "Cannot get RMApp by appId" log when RM failover
Date Fri, 25 Jan 2019 08:18:00 GMT
Jiandan Yang  created YARN-9237:
-----------------------------------

             Summary: RM prints a lot of "Cannot get RMApp by appId" log when RM failover
                 Key: YARN-9237
                 URL: https://issues.apache.org/jira/browse/YARN-9237
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn
            Reporter: Jiandan Yang 


After an RM failover, I found many occurrences of the following message in the active RM's log file:
{code:java}
2019-01-24 15:43:58,999 WARN org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl:
Cannot get RMApp by appId=application_1542178952162_34746156, just added it to finishedApplications
list for cleanup
.....
{code}
I looked back through the RM logs and found that this app had finished hours earlier:
{code:java}
2019-01-23 21:49:55,683 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1542178952162_34746156_000001 State change from FINAL_SAVING to FINISHING
{code}
The RM prints "Cannot get RMApp by appId" for the following reasons:
1. The RM fails over.
2. Each NM reports all of its running apps to the RM in its register request.
3. The list of running apps is taken from NMContext, and some of those apps may already have finished.
4. In my cluster, yarn.log-aggregation-enable=false and yarn.nodemanager.log.retain-seconds=86400 (1 day), so an app stays in NMContext for 24 hours after it has finished.
5. My YARN cluster has 7k nodes and runs 50k apps per day, so the NMs report many finished apps to the RM.
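
The sequence above can be illustrated with a small, self-contained simulation. This is only a sketch and does not use any Hadoop classes: NodeManagerSketch, ResourceManagerSketch, simulate, and the second application ID are hypothetical names made up for illustration. It also assumes that the newly active RM has no record of an app that finished hours earlier, and that the NM drops a finished app from its context once the retention period has elapsed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FailoverRegistrationSketch {

    /** Hypothetical stand-in for an NM and its NMContext (not the Hadoop class). */
    static class NodeManagerSketch {
        // appId -> finish time in ms, or null while the app is still running
        private final Map<String, Long> appsInContext = new LinkedHashMap<>();
        private final long retainMillis;

        NodeManagerSketch(long retainSeconds) {
            this.retainMillis = retainSeconds * 1000L;
        }

        void appStarted(String appId) {
            appsInContext.put(appId, null);
        }

        void appFinished(String appId, long nowMillis) {
            appsInContext.put(appId, nowMillis);
        }

        /** Drops finished apps whose retention window has elapsed. */
        void expire(long nowMillis) {
            appsInContext.values().removeIf(
                finishedAt -> finishedAt != null && nowMillis - finishedAt >= retainMillis);
        }

        /** Reports every app still in the context, finished or not. */
        List<String> appsReportedOnRegister() {
            return new ArrayList<>(appsInContext.keySet());
        }
    }

    /** Hypothetical stand-in for the newly active RM. */
    static class ResourceManagerSketch {
        private final Set<String> knownApps = new HashSet<>();
        final List<String> finishedApplications = new ArrayList<>();
        final List<String> warnings = new ArrayList<>();

        void recoverApp(String appId) {
            knownApps.add(appId);
        }

        /** Warns about each reported app the RM does not know and queues it for cleanup. */
        void handleRegistration(List<String> reportedApps) {
            for (String appId : reportedApps) {
                if (!knownApps.contains(appId)) {
                    warnings.add("Cannot get RMApp by appId=" + appId
                        + ", just added it to finishedApplications list for cleanup");
                    finishedApplications.add(appId);
                }
            }
        }
    }

    /** Simulates one failover 18 hours after an app finished and returns the RM state. */
    static ResourceManagerSketch simulate(long retainSeconds) {
        String finishedApp = "application_1542178952162_34746156"; // ID from the log excerpt
        String runningApp = "application_1542178952162_40000000";  // made up for this sketch
        long failoverAt = 18L * 60 * 60 * 1000; // ms after the first app finished

        NodeManagerSketch nm = new NodeManagerSketch(retainSeconds);
        nm.appStarted(finishedApp);
        nm.appStarted(runningApp);
        nm.appFinished(finishedApp, 0L);
        nm.expire(failoverAt);

        ResourceManagerSketch rm = new ResourceManagerSketch();
        rm.recoverApp(runningApp); // assumption: the RM only knows the running app
        rm.handleRegistration(nm.appsReportedOnRegister());
        return rm;
    }

    public static void main(String[] args) {
        System.out.println("retain=86400s warnings: " + simulate(86400L).warnings.size());
        System.out.println("retain=10800s warnings: " + simulate(10800L).warnings.size());
    }
}
```

In this sketch, a retention of 86400 seconds yields one warning for the finished app, while a hypothetical shorter retention of 10800 seconds yields none, because the app has already left the simulated context by the time the NM re-registers.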





