hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3046) [Event producers] Implement MapReduce AM writing some MR metrics to ATS
Date Tue, 14 Apr 2015 15:17:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494233#comment-14494233 ]

Junping Du commented on YARN-3046:

Thanks [~zjshen] for the review and comments!
bq. I'm not sure if we should have a MR config to determine is new or old timeline service.
If this MR config is set to true, but YARN cluster is still setup with old timeline service.
It still doesn't work.
Ideally, applications (MR, DS, etc.) would not need to be aware of the timeline service
version at all. However, we have already decided to use different methods/structures
for v1 and v2 in TimelineClient, so applications have to know which version of the timeline
service is in use.
The next option is to let the application obtain timeline-related info from YARN/RM. This could
be done through registerApplicationMaster() in ApplicationMasterProtocol, with the return value
indicating the service state: "off", "v1_on", or "v2_on".
The last option is what the v1 patch does: follow the existing v1 approach and enable the
timeline service through a separate configuration, MRJobConfig.MAPREDUCE_JOB_EMIT_TIMELINE_DATA.
Personally, I would prefer the 2nd option. The reason is just as you mentioned: the application
owner shouldn't have to be aware of RM/YARN infrastructure details. However, this needs a change
to the YARN AM protocol, changes to the different applications (distributed shell, etc.), and marking
the existing MR configuration as deprecated (otherwise we would have two overlapping configurations
that can conflict).
I would prefer to file a separate JIRA to track this more carefully, as it is important
but outside the scope of this JIRA. What do you think?
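
As a rough illustration of the 2nd option, the sketch below shows how an AM might branch on a timeline service state string. Everything here is hypothetical: the TimelineServiceMode enum, fromWire() and choosePublisher() are names made up for this example, and no such field exists in the current registerApplicationMaster() response. Only the three values "off", "v1_on" and "v2_on" come from the proposal above.

```java
import java.util.Locale;

/** Hypothetical sketch; none of these names exist in YARN or MapReduce. */
final class TimelineModeSketch {

    /** The three states proposed above: "off", "v1_on", "v2_on". */
    enum TimelineServiceMode {
        OFF, V1_ON, V2_ON;

        /** Parses the wire value; null or unknown input is treated as OFF. */
        static TimelineServiceMode fromWire(String value) {
            if (value == null) {
                return OFF;
            }
            try {
                return valueOf(value.trim().toUpperCase(Locale.ROOT));
            } catch (IllegalArgumentException e) {
                return OFF;
            }
        }
    }

    /** Names the publishing path an AM would pick for the given mode. */
    static String choosePublisher(TimelineServiceMode mode) {
        switch (mode) {
            case V1_ON:
                return "v1 client";
            case V2_ON:
                return "v2 client";
            default:
                return "no-op";
        }
    }

    public static void main(String[] args) {
        // Stand-in for a value that a registration response might carry.
        String fromRm = "v2_on";
        System.out.println(choosePublisher(TimelineServiceMode.fromWire(fromRm)));
    }
}
```

With this shape, the application never reads a version-specific configuration key itself; it only reacts to whatever state the RM reports.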

bq. No need to have JobHistoryEventUtils, you can move util method to JobHistoryUtils if
you want.
I tried to do so before I created JobHistoryEventUtils. However, I found we cannot, because
JobHistoryUtils is in the hadoop-mapreduce-client-common component, while some consumers of the
method are in the hadoop-mapreduce-client-core component (e.g. ReduceAttemptFinishedEvent,
TaskAttemptFinishedEvent). Currently, hadoop-mapreduce-client-common depends on
hadoop-mapreduce-client-core, so these events under hadoop-mapreduce-client-core cannot depend on
JobHistoryUtils without creating a bidirectional dependency. The bad news is that we cannot move
JobHistoryUtils to hadoop-mapreduce-client-core either, because it references other classes (e.g.
MRApps) that are still in hadoop-mapreduce-client-common. That's why I created JobHistoryEventUtils
for the shared methods.
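
To make the dependency constraint concrete, here is a small sketch that models modules as a directed graph and checks whether a new edge would close a cycle. The ModuleCycleSketch class is purely illustrative and is not how Maven or Hadoop's build detects the problem; only the two module names are taken from the explanation above.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative model of module dependencies; not part of any build tool. */
final class ModuleCycleSketch {
    private final Map<String, Set<String>> deps = new HashMap<>();

    /** Records that module {@code from} depends on module {@code to}. */
    void addDependency(String from, String to) {
        deps.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    /** True if {@code target} is reachable from {@code start} via dependency edges. */
    boolean dependsOn(String start, String target) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String module = stack.pop();
            if (module.equals(target)) {
                return true;
            }
            if (seen.add(module)) {
                for (String next : deps.getOrDefault(module, Collections.emptySet())) {
                    stack.push(next);
                }
            }
        }
        return false;
    }

    /** Adding from -> to closes a cycle exactly when {@code to} already reaches {@code from}. */
    boolean wouldCreateCycle(String from, String to) {
        return dependsOn(to, from);
    }

    public static void main(String[] args) {
        ModuleCycleSketch graph = new ModuleCycleSketch();
        graph.addDependency("hadoop-mapreduce-client-common", "hadoop-mapreduce-client-core");
        // Events in core using JobHistoryUtils would need a core -> common edge.
        System.out.println(graph.wouldCreateCycle(
            "hadoop-mapreduce-client-core", "hadoop-mapreduce-client-common"));
    }
}
```

Placing the shared methods in a class that lives in hadoop-mapreduce-client-core avoids adding the reverse edge at all.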

bq. In the current way of shutting down the threadpool, is it guaranteed that the pending
entity is going to be published before shutting down?
There is a delay (60 secs) to wait for pending entities to be posted. This delay is typically
much larger than the service discovery time (typically the heartbeat interval, not counting
the collector failover case) plus the timeline entity REST posting time. It is also larger than
the posting time of each entity in the failure case with maximum retries (30 * 1 sec). So I think
it should be safe to do so here.
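
The general pattern being described can be sketched with the standard java.util.concurrent API: stop accepting new work, then wait for a bounded grace period so queued tasks can finish. This is a minimal sketch of that pattern, not the code in the patch; DrainingShutdownSketch, shutdownAndDrain() and runDemo() are names invented for the example, and the short sleeping tasks merely stand in for entity posts.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only; not taken from the YARN-3046 patch. */
final class DrainingShutdownSketch {

    /**
     * Stops accepting new tasks, then waits up to the grace period for
     * already-submitted tasks. Returns true if all of them finished in time.
     */
    static boolean shutdownAndDrain(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // previously submitted tasks still run
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // grace period expired; cancel what is left
            return false;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Submits {@code count} short tasks and returns how many completed before shutdown returned. */
    static int runDemo(int count, long graceSeconds) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger posted = new AtomicInteger();
        for (int i = 0; i < count; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(10); // stand-in for one entity post
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                posted.incrementAndGet();
            });
        }
        shutdownAndDrain(pool, graceSeconds, TimeUnit.SECONDS);
        return posted.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(5, 60));
    }
}
```

The grace period only helps if the queued work fits inside it, which is why the comparison against the worst-case per-entity retry time (30 * 1 sec) matters.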

I will address the other comments in a new patch.

> [Event producers] Implement MapReduce AM writing some MR metrics to ATS
> -----------------------------------------------------------------------
>                 Key: YARN-3046
>                 URL: https://issues.apache.org/jira/browse/YARN-3046
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Junping Du
>         Attachments: YARN-3046-no-test-v2.patch, YARN-3046-no-test.patch, YARN-3046-v1-rebase.patch,
> Per design in YARN-2928, select a handful of MR metrics (e.g. HDFS bytes written) and
> have the MR AM write the framework-specific metrics to ATS.

This message was sent by Atlassian JIRA
