hadoop-mapreduce-issues mailing list archives

From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6647) MR usage counters use the resources requested instead of the resources allocated
Date Tue, 27 Jun 2017 16:54:00 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065122#comment-16065122 ]

Haibo Chen commented on MAPREDUCE-6647:
---------------------------------------

Thanks for pointing this out, [~vrushalic]! I agree this will change what downstream apps see.
However, the change does not make the counters report actual usage.
Instead, the counters now use the resources actually allocated by YARN, i.e. how much
was actually reserved for the job, rather than the resources requested by the tasks. The bug we
were trying to fix is that if a map task requests 1GB to run and the cluster's minimum allocation
is set to 2GB, the cluster actually reserves 2GB for the task, while the counters were still
computed from the 1GB request.
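
To make the 1GB vs. 2GB example concrete, here is a rough sketch of the arithmetic. It assumes, for illustration only, that the scheduler rounds a request up to a multiple of the minimum allocation; the class and method names are hypothetical and this is not YARN's actual normalization code.

```java
public class AllocationSketch {

    /** Rounds a requested size up to the nearest multiple of stepMb (assumed rounding rule). */
    static int roundUpToStep(int requestedMb, int stepMb) {
        if (stepMb <= 0) {
            return requestedMb; // no rounding configured in this simplified model
        }
        return ((requestedMb + stepMb - 1) / stepMb) * stepMb;
    }

    /** Simplified model: never reserve less than the minimum allocation. */
    static int allocatedMb(int requestedMb, int minAllocMb) {
        return Math.max(minAllocMb, roundUpToStep(requestedMb, minAllocMb));
    }

    public static void main(String[] args) {
        int requestedMb = 1024;    // what the map task asks for
        int minAllocMb = 2048;     // cluster minimum allocation
        long durationMs = 60_000L; // hypothetical task duration

        int allocated = allocatedMb(requestedMb, minAllocMb);

        // MB-millis computed from the request vs. from the allocation
        System.out.println("requested MB-millis: " + durationMs * requestedMb);
        System.out.println("allocated MB-millis: " + durationMs * allocated);
    }
}
```

Under these assumptions the allocation-based value is twice the request-based one, which is the kind of shift a downstream consumer of these counters would see.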

Can you please elaborate a little more on how downstream apps make use of these counters?


Anyway, I think we should revert this from branch-2. [~rkanter] Thoughts?

> MR usage counters use the resources requested instead of the resources allocated
> --------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6647
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6647
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>             Fix For: 2.9.0, 3.0.0-alpha1
>
>         Attachments: mapreduce6647.001.patch, mapreduce6647.002.patch, mapreduce6647.003.patch, mapreduce6647.004.patch
>
>
> As can be seen in the following snippet, the MR counters for usage use the resources
> requested instead of the resources allocated. The scheduler increment-allocation-mb configs
> could lead to these values not being the same. We could change the counters to use the allocated
> resources in order to account for this.
> {code}
>   private static void updateMillisCounters(JobCounterUpdateEvent jce,
>       TaskAttemptImpl taskAttempt) {
>      /***omitted**/
>     long duration = (taskAttempt.getFinishTime() - taskAttempt.getLaunchTime());
>     int mbRequired =
>         taskAttempt.getMemoryRequired(taskAttempt.conf, taskType);
>     int vcoresRequired = taskAttempt.getCpuRequired(taskAttempt.conf, taskType);
>     int minSlotMemSize = taskAttempt.conf.getInt(
>       YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
>       YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);
>     int simSlotsRequired =
>         minSlotMemSize == 0 ? 0 : (int) Math.ceil((float) mbRequired
>             / minSlotMemSize);
>     if (taskType == TaskType.MAP) {
>       jce.addCounterUpdate(JobCounter.SLOTS_MILLIS_MAPS, simSlotsRequired * duration);
>       jce.addCounterUpdate(JobCounter.MB_MILLIS_MAPS, duration * mbRequired);
>       jce.addCounterUpdate(JobCounter.VCORES_MILLIS_MAPS, duration * vcoresRequired);
>       jce.addCounterUpdate(JobCounter.MILLIS_MAPS, duration);
>     } else {
>       jce.addCounterUpdate(JobCounter.SLOTS_MILLIS_REDUCES, simSlotsRequired * duration);
>       jce.addCounterUpdate(JobCounter.MB_MILLIS_REDUCES, duration * mbRequired);
>       jce.addCounterUpdate(JobCounter.VCORES_MILLIS_REDUCES, duration * vcoresRequired);
>       jce.addCounterUpdate(JobCounter.MILLIS_REDUCES, duration);
>     }
> {code}
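
For comparison, the same counter arithmetic driven by allocated rather than requested resources might look roughly like the sketch below. It is self-contained and hypothetical: the counter names mirror the JobCounter constants in the snippet above, but the class, the method signature, and the use of a plain map in place of JobCounterUpdateEvent are assumptions for illustration, not the contents of the attached patches.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AllocatedCountersSketch {

    /**
     * Computes the millis counters for one task attempt from the resources
     * the container was actually allocated (hypothetical helper).
     */
    static Map<String, Long> millisCounters(boolean isMap, long launchTime,
            long finishTime, int allocatedMb, int allocatedVcores,
            int minSlotMemSize) {
        long duration = finishTime - launchTime;
        // Same slot simulation as the quoted snippet, fed the allocated size
        int simSlots = minSlotMemSize == 0
            ? 0 : (int) Math.ceil((float) allocatedMb / minSlotMemSize);
        String suffix = isMap ? "MAPS" : "REDUCES";

        Map<String, Long> updates = new LinkedHashMap<>();
        updates.put("SLOTS_MILLIS_" + suffix, simSlots * duration);
        updates.put("MB_MILLIS_" + suffix, duration * allocatedMb);
        updates.put("VCORES_MILLIS_" + suffix, duration * allocatedVcores);
        updates.put("MILLIS_" + suffix, duration);
        return updates;
    }

    public static void main(String[] args) {
        // A map attempt that ran for 1 second in a 2048 MB, 1 vcore container
        System.out.println(millisCounters(true, 0L, 1000L, 2048, 1, 1024));
    }
}
```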



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org

