hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3127) Apphistory url crashes when RM switches with ATS enabled
Date Wed, 06 May 2015 00:10:01 GMT

    [ https://issues.apache.org/jira/browse/YARN-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529606#comment-14529606 ]

Naganarasimha G R commented on YARN-3127:

Thanks for reviewing, [~gtCarrera9].
The main cause of the issue described here has already been addressed in another JIRA by
[~xgong], but when we test this way we still see null in the web UI. More importantly, this
JIRA is still needed because events are published for every app (start and finish) on RM
failover. So if 10000 apps are maintained, that many additional unnecessary events are
triggered. This we need to address. As for the issue pointed out by [~xgong], I had asked for
suggestions on the approach being taken and am waiting for a reply. AFAIK we need to ensure
that the ATS events are sent first and only then store the final application state to the RM
state store in the FINAL_SAVING transition (and also cover other possible cases where the app
is created and killed before an attempt is created, in which case FINAL_SAVING is not called).
If this approach is fine, I will update the patch and the description.
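
A minimal sketch of the ordering proposed above, not the actual patch. Every name in it
(AppLifecycleSketch, publishToTimeline, storeFinalState, finish, recover) is hypothetical
and stands in for the real RM code. It assumes a single exit path that publishes to ATS
before saving the final state, so an app recovered with a stored final state never needs
its events republished.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names are illustrative, not real YARN classes.
final class AppLifecycleSketch {
    // Ordered record of side effects, so the sequence can be inspected.
    private final List<String> actions = new ArrayList<>();

    // Stand-in for sending an event to the timeline server (ATS).
    private void publishToTimeline(String appId, String state) {
        actions.add("publish " + appId + " " + state);
    }

    // Stand-in for persisting the final state in the RM state store.
    private void storeFinalState(String appId, String state) {
        actions.add("store " + appId + " " + state);
    }

    // Assumed single exit path for every app, including one killed before
    // any attempt exists: publish first, then persist the final state.
    void finish(String appId, String finalState) {
        publishToTimeline(appId, finalState);
        storeFinalState(appId, finalState);
    }

    // Recovery after failover: a stored final state implies the ATS event
    // was already sent (see finish), so nothing is published here.
    void recover(String appId, String storedFinalState) {
        actions.add("recover " + appId + " " + storedFinalState);
    }

    // Returns a copy so callers cannot alter the recorded sequence.
    List<String> actions() {
        return new ArrayList<>(actions);
    }
}
```

With this ordering, recovering 10000 finished apps would add 10000 "recover" entries and no
"publish" entries.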

> Apphistory url crashes when RM switches with ATS enabled
> --------------------------------------------------------
>                 Key: YARN-3127
>                 URL: https://issues.apache.org/jira/browse/YARN-3127
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, timelineserver
>    Affects Versions: 2.6.0
>         Environment: RM HA with ATS
>            Reporter: Bibin A Chundatt
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3127.20150213-1.patch, YARN-3127.20150329-1.patch
> 1.Start RM with HA and ATS configured and run some yarn applications
> 2.Once applications are finished successfully start timeline server
> 3.Now failover HA from active to standby
> 4.Access timeline server URL <IP>:<PORT>/applicationhistory
> Result: Application history URL fails with below info
> {quote}
> 2015-02-03 20:28:09,511 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the
> java.lang.reflect.UndeclaredThrowableException
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
> 	at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:80)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
> 	at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
> 	...
> Caused by: org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: The entity for application attempt appattempt_1422972608379_0001_000001 doesn't exist in the timeline
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getApplicationAttempt(ApplicationHistoryManagerOnTimelineStore.java:151)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.generateApplicationReport(ApplicationHistoryManagerOnTimelineStore.java:499)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAllApplications(ApplicationHistoryManagerOnTimelineStore.java:108)
> 	at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:84)
> 	at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:81)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	... 51 more
> 2015-02-03 20:28:09,512 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory
> org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
> {quote}
> Behaviour with AHS with file based history store
> 	-Apphistory url is working 
> 	-No attempt entries are shown for each application.
> Based on initial analysis, when the RM switches, application attempts from the state store
> are not replayed; only applications are.
> So when the /applicationhistory URL is accessed, it looks up every attempt ID and fails.

This message was sent by Atlassian JIRA