hadoop-mapreduce-issues mailing list archives

From "Ahmed Radwan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4469) Resource calculation in child tasks is CPU-heavy
Date Wed, 08 Aug 2012 22:27:21 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431462#comment-13431462 ]

Ahmed Radwan commented on MAPREDUCE-4469:

Many thanks, Todd! I agree. I looked further into how these values are updated. I had thought
the streaming process was still accounted for because these values are calculated cumulatively.
For example, in getCumulativeCpuTime():

    cpuTime += incJiffies * JIFFY_LENGTH_IN_MILLIS;
    return cpuTime;

But it seems that the pTree and its values are only updated when getProcResourceValues() is
called, and that is only called from initialize() and updateResourceCounters() in the Task.

So basically, any resource changes between two calls of getProcResourceValues() won't be
accounted for.
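
To illustrate the point, here is a toy model of snapshot-based accounting. It is not the
Hadoop code: the class name SnapshotCpuModel, the pid-to-jiffies map, and the 10 ms jiffy
length are all assumptions made for the sketch. It only shows that a process which starts
and exits between two snapshots never contributes to the cumulative value.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy model of cumulative CPU accounting driven by snapshots. */
final class SnapshotCpuModel {
    // Assumption for this sketch: a 100 Hz clock tick, i.e. 10 ms per jiffy.
    static final long JIFFY_LENGTH_IN_MILLIS = 10;

    private final Map<Integer, Long> lastJiffiesByPid = new HashMap<>();
    private long cpuTime = 0;

    /**
     * Takes one snapshot of the live processes (pid -> total jiffies used so far)
     * and returns the cumulative CPU time in milliseconds.
     */
    long update(Map<Integer, Long> liveJiffiesByPid) {
        long incJiffies = 0;
        for (Map.Entry<Integer, Long> e : liveJiffiesByPid.entrySet()) {
            // A pid seen for the first time is counted from zero.
            long previous = lastJiffiesByPid.getOrDefault(e.getKey(), 0L);
            incJiffies += e.getValue() - previous;
        }
        lastJiffiesByPid.clear();
        lastJiffiesByPid.putAll(liveJiffiesByPid);
        cpuTime += incJiffies * JIFFY_LENGTH_IN_MILLIS;
        return cpuTime;
    }

    public static void main(String[] args) {
        SnapshotCpuModel model = new SnapshotCpuModel();

        Map<Integer, Long> first = new HashMap<>();
        first.put(1, 100L);
        System.out.println(model.update(first));  // prints 1000

        // Suppose pid 2 ran for 50 jiffies and exited before this snapshot:
        // it is not in the map, so its 500 ms are never added.
        Map<Integer, Long> second = new HashMap<>();
        second.put(1, 130L);
        System.out.println(model.update(second)); // prints 1300
    }
}
```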

Since this overhead is incurred on every update from the task, what if we add a new configuration
property that defines the number of updates to skip before refreshing the resource counters?
For example, the resource counters would be refreshed only on every 10th update by default, but
the user could still tune the resolution of these updates through this property.
What do you think?
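
A minimal sketch of the skip-counter idea, not the actual patch: the class name
ThrottledResourceUpdater, its methods, and the Runnable standing in for the expensive
counter refresh are all hypothetical. The interval would come from the proposed
configuration property, whose name is not decided here.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: run an expensive refresh only on every Nth status update. */
final class ThrottledResourceUpdater {
    private final int updateInterval;        // e.g. 10, read from configuration
    private final Runnable expensiveRefresh; // stands in for the counter refresh
    private int updatesSinceRefresh = 0;

    ThrottledResourceUpdater(int updateInterval, Runnable expensiveRefresh) {
        if (updateInterval < 1) {
            throw new IllegalArgumentException("updateInterval must be >= 1");
        }
        this.updateInterval = updateInterval;
        this.expensiveRefresh = expensiveRefresh;
    }

    /** Called on every status update; returns true if the refresh actually ran. */
    synchronized boolean onStatusUpdate() {
        updatesSinceRefresh++;
        if (updatesSinceRefresh < updateInterval) {
            return false; // skip this update
        }
        updatesSinceRefresh = 0;
        expensiveRefresh.run();
        return true;
    }

    /** Forces a refresh, e.g. when the task finishes, so final values are current. */
    synchronized void forceRefresh() {
        updatesSinceRefresh = 0;
        expensiveRefresh.run();
    }

    public static void main(String[] args) {
        AtomicInteger refreshes = new AtomicInteger();
        ThrottledResourceUpdater updater =
            new ThrottledResourceUpdater(10, refreshes::incrementAndGet);

        for (int i = 0; i < 25; i++) {
            updater.onStatusUpdate();
        }
        System.out.println(refreshes.get()); // prints 2 (after updates 10 and 20)

        updater.forceRefresh();
        System.out.println(refreshes.get()); // prints 3
    }
}
```

An interval of 1 keeps today's behaviour of refreshing on every update. A forced refresh at
task completion is one way to keep the final counter values accurate despite the skipping.
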
> Resource calculation in child tasks is CPU-heavy
> ------------------------------------------------
>                 Key: MAPREDUCE-4469
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4469
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: performance, task
>    Affects Versions: 1.0.3
>            Reporter: Todd Lipcon
>            Assignee: Ahmed Radwan
>         Attachments: MAPREDUCE-4469.patch
> In doing some benchmarking on a hadoop-1 derived codebase, I noticed that each of the
> child tasks was doing a ton of syscalls. Upon stracing, I noticed that it's spending a lot
> of time looping through all the files in /proc to calculate resource usage.
> As a test, I added a flag to disable use of the ResourceCalculatorPlugin within the tasks.
> On a CPU-bound 500G-sort workload, this improved total job runtime by about 10% (map slot-seconds
> by 14%, reduce slot-seconds by 8%).


