hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4183) Enabling generic application history forces every job to get a timeline service delegation token
Date Mon, 16 Nov 2015 20:14:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-4183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007242#comment-15007242 ]

Sangjin Lee commented on YARN-4183:

I agree we probably shouldn't put too many points of discussion here that may not be core
to the JIRA at hand. I'd like to focus on the SystemMetricsPublisher and the two flags,
yarn.resourcemanager.system-metrics-publisher.enabled and yarn.timeline-service.enabled.

bq. as far as 2.7.2 is concerned i feel yarn.resourcemanager.system-metrics-publisher.enabled
is sufficient to be configured.

I'm not sure if that is desirable. Here is a key question. Suppose the timeline service is
disabled, and no timeline daemons are running. And suppose yarn.resourcemanager.system-metrics-publisher.enabled
is *true*, and we changed SystemMetricsPublisher to check only that flag. What would happen?
AFAICT, the SystemMetricsPublisher will fire up the timeline client, and will try to send
all the events actively to the timeline server. But since the timeline server is down, it
will lead to continuous failures writing to the timeline server, right? IMO, this kind
of very late failure is deeply unsatisfying and problematic.

If the answer is "yarn.resourcemanager.system-metrics-publisher.enabled should not be set
to true if the timeline service is disabled", then it only makes it clear that yarn.resourcemanager.system-metrics-publisher.enabled=true
implies yarn.timeline-service.enabled=true. Then we should check it explicitly. Thoughts?
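
One possible reading of "check it explicitly" is a fail-fast check at startup, so that the inconsistent combination is rejected before any timeline client is created. The sketch below is only an illustration under stated assumptions: PublisherConfigCheck and its methods are hypothetical names, a plain Map stands in for the YARN configuration object, and a missing key is assumed to mean "false". It is not the code in SystemMetricsPublisher.

```java
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class PublisherConfigCheck {
    static final String TIMELINE_ENABLED = "yarn.timeline-service.enabled";
    static final String PUBLISHER_ENABLED =
        "yarn.resourcemanager.system-metrics-publisher.enabled";

    private PublisherConfigCheck() {}

    /** Reads a boolean flag; a missing key is treated as false (assumption). */
    static boolean flag(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    /**
     * Rejects the combination discussed above: publisher on, timeline service off.
     * Failing here replaces the late, repeated write failures with one early error.
     */
    static void validate(Map<String, String> conf) {
        if (flag(conf, PUBLISHER_ENABLED) && !flag(conf, TIMELINE_ENABLED)) {
            throw new IllegalStateException(
                PUBLISHER_ENABLED + "=true requires " + TIMELINE_ENABLED + "=true");
        }
    }
}
```

Whether to throw, or to log a warning and leave the publisher off, is a design choice this sketch does not settle; either way the decision is made once, at startup.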

bq. As far as i view it "yarn.timeline-service.enabled" name is misleading, it should be
more to signify client requires the timeline service's delegation token. Which will not be
a server side config. Thoughts?

I'm not sure if that's how it's currently interpreted, but the way I view it is that it should
act as a "master switch" for the timeline service; i.e. the highest level switch that toggles
the feature on and off on all sides. There can be "sub-switches" that can control finer-grained
parts of the feature (e.g. the system metrics publisher). But those subfeatures should always
check the master switch before checking their own. This will lead to a clean and consistent
pattern of using the feature everywhere.

Also, consider the fact that the system metrics publisher may not be the only server-side
component that interacts with the timeline service. There may be others and there will be
more with the timeline service v.2 (e.g. NM collector service, etc.). If they all handle the
failure case of the timeline server not being up in their own way, it would be quite confusing
and error-prone. It would be consistent and easy to handle if everyone checked the master switch
(and possibly their own subfeature switch) and turned the feature off as early as possible.
So I would argue that yarn.timeline-service.enabled should be interpreted as such a "master
switch", both for server-side and client-side.
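
To make the "master switch" idea concrete, here is a minimal sketch in which every sub-feature check goes through the master switch first. TimelineSwitches and its method names are hypothetical, a plain Map stands in for the YARN configuration, and a missing key is assumed to mean "false"; only the two property names come from the discussion above.

```java
import java.util.Map;

/** Hypothetical helper illustrating the "master switch" pattern; not Hadoop code. */
final class TimelineSwitches {
    static final String MASTER = "yarn.timeline-service.enabled";
    static final String PUBLISHER =
        "yarn.resourcemanager.system-metrics-publisher.enabled";

    private TimelineSwitches() {}

    /** Reads a boolean flag; a missing key is treated as false (assumption). */
    private static boolean flag(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    /** The highest-level switch for the whole feature. */
    static boolean timelineServiceEnabled(Map<String, String> conf) {
        return flag(conf, MASTER);
    }

    /** A sub-feature is on only if the master switch and its own switch are both on. */
    static boolean subFeatureEnabled(Map<String, String> conf, String subSwitch) {
        return timelineServiceEnabled(conf) && flag(conf, subSwitch);
    }

    /** Example sub-feature: the system metrics publisher. */
    static boolean systemMetricsPublisherEnabled(Map<String, String> conf) {
        return subFeatureEnabled(conf, PUBLISHER);
    }
}
```

Any other component that talks to the timeline service could call subFeatureEnabled with its own key, so the "is the timeline server expected to be up?" decision lives in one place.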

I'd like to hear your thoughts. Thanks!

> Enabling generic application history forces every job to get a timeline service delegation token
> ------------------------------------------------------------------------------------------------
>                 Key: YARN-4183
>                 URL: https://issues.apache.org/jira/browse/YARN-4183
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: Mit Desai
>            Assignee: Mit Desai
>         Attachments: YARN-4183.1.patch
> When enabling just the Generic History Server and not the timeline server, the system
> metrics publisher will not publish the events to the timeline store as it checks if the timeline
> server and system metrics publisher are enabled before creating a timeline client.
> To make it work, if the timeline service flag is turned on, it will force every yarn
> application to get a delegation token.
> Instead of checking if timeline service is enabled, we should be checking if application
> history server is enabled.

This message was sent by Atlassian JIRA
