hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7422) Application History Server URL does not direct to the appropriate UI for failed/killed jobs
Date Wed, 01 Nov 2017 13:49:01 GMT

    [ https://issues.apache.org/jira/browse/YARN-7422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234083#comment-16234083 ]

Jason Lowe commented on YARN-7422:

I am a little confused about the goals of this JIRA.  This cannot be solved solely on the AM
side.  The AM runs on nodes that can fail at any time without giving the AM any chance to
perform corrective measures regarding the tracking URL.  AM behavior could make this better
but it cannot be completely solved there.

Completely solving this requires changes on the YARN server side because the AM could catastrophically
fail without executing any shutdown code.  One simple method that would work for Tez and possibly
other frameworks would be supporting a history URL during registration in addition to the
one already supported at unregistration.  Then as long as the AM registers we have a place
to direct users after the app completes even if the AM subsequently crashes spectacularly
and never unregisters.  If the AM crashes before it even registers then I would argue the
existing AHS-only history behavior should be sufficient since it's not likely the AM had a
chance to do any framework-specific behavior before it registered.
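
To make the proposed fallback order concrete, here is a minimal sketch of how the server side might pick a history link once an app has finished. It is not actual YARN code: AppRecord, register, unregister, resolve_history_url, and the example URLs are all hypothetical names invented for illustration, and the AHS path layout is only a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AppRecord:
    """Hypothetical server-side record of what the AM has reported so far."""
    app_id: str
    # Proposed addition: history URL supplied when the AM registers.
    registered_history_url: Optional[str] = None
    # Existing behavior: history URL supplied when the AM unregisters.
    unregistered_history_url: Optional[str] = None


def register(app: AppRecord, history_url: Optional[str] = None) -> None:
    """Model AM registration, optionally carrying a history URL."""
    app.registered_history_url = history_url


def unregister(app: AppRecord, history_url: Optional[str] = None) -> None:
    """Model a clean AM shutdown that reports its final history URL."""
    app.unregistered_history_url = history_url


def resolve_history_url(app: AppRecord, ahs_base_url: str) -> str:
    """Pick the link to show users after the app completes."""
    # 1. The AM shut down cleanly and reported a final URL.
    if app.unregistered_history_url:
        return app.unregistered_history_url
    # 2. The AM crashed after registering: use the URL given at registration.
    if app.registered_history_url:
        return app.registered_history_url
    # 3. The AM never registered: fall back to the generic AHS page.
    #    (Placeholder path layout, assumed for this sketch only.)
    return f"{ahs_base_url}/{app.app_id}"


AHS = "http://ahs.example.com/applicationhistory/app"

# AM registers with a history URL, then crashes without unregistering.
crashed = AppRecord("application_1_0001")
register(crashed, "http://tez-ui.example.com/app/application_1_0001")
crashed_url = resolve_history_url(crashed, AHS)

# AM dies before it even registers.
never_registered = AppRecord("application_1_0002")
never_registered_url = resolve_history_url(never_registered, AHS)

# AM registers and later unregisters cleanly with a final URL.
clean = AppRecord("application_1_0003")
register(clean, "http://tez-ui.example.com/app/application_1_0003")
unregister(clean, "http://tez-ui.example.com/dag/final_0003")
clean_url = resolve_history_url(clean, AHS)
```

The point of the ordering is that the registration-time URL is only a safety net: a URL reported at unregistration still wins, and the AHS-only page remains the last resort.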

> Application History Server URL does not direct to the appropriate UI for failed/killed jobs
> -------------------------------------------------------------------------------------------
>                 Key: YARN-7422
>                 URL: https://issues.apache.org/jira/browse/YARN-7422
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.8.1
>            Reporter: Kuhu Shukla
>            Priority: Major
> In cases where the AM fails fatally, the AHS page's history link does not work because the AM was
> not able to update the trackingURL for the job. This JIRA is to track any last-attempt effort
> we can make from the AM to provide a tracking URL in cases where the AM failure does not occur
> immediately at startup. Any ideas and corrections would be appreciated.
