hadoop-mapreduce-issues mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-2047) reduce overhead of findSpeculativeTask
Date Wed, 01 Sep 2010 20:41:57 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated MAPREDUCE-2047:
-----------------------------------------

    Attachment: mapreduce-2047.1.patch

Optimizes findSpeculativeTask to recompute the speculation candidates periodically instead of
on every heartbeat.

Another simple change is also attached here (it doesn't seem worth a separate JIRA): it removes
a synchronized invocation in getTasksToKill that also appears very expensive based on some profiling.

> reduce overhead of findSpeculativeTask
> --------------------------------------
>
>                 Key: MAPREDUCE-2047
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2047
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>         Environment: hadoop-20 with HADOOP-2141
>            Reporter: Joydeep Sen Sarma
>            Assignee: Joydeep Sen Sarma
>         Attachments: mapreduce-2047.1.patch
>
>
> We are bottlenecked (in the JT) on the jobtracker lock, and calls to findSpeculativeTask
> frequently show up as one of the top routines (by time) called while holding this lock.
> This routine calls canBeSpeculated() and hasRunOnMachine() for each task in a candidate
> job. Both of these routines are reasonably expensive when invoked repeatedly for thousands of
> tasks. The top candidates for speculation from a job only need to be refreshed periodically
> (and not once every heartbeat), so we can avoid most of these invocations this way.
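
The periodic-refresh idea described in the issue can be illustrated with a minimal sketch. This is not the code in mapreduce-2047.1.patch: the class name PeriodicCandidateCache, the injectable clock, and the refresh interval are all hypothetical, and the expensive scan stands in for the per-task canBeSpeculated()/hasRunOnMachine() pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper, not taken from the attached patch.
class PeriodicCandidateCache<T> {
    // Stand-in for the expensive per-task scan over a job's tasks.
    private final Supplier<List<T>> expensiveScan;
    // Injectable clock (assumption, for testability); e.g. System::currentTimeMillis.
    private final LongSupplier clockMillis;
    private final long refreshIntervalMillis;

    private List<T> cached = new ArrayList<>();
    private long lastRefreshMillis;
    private boolean everRefreshed = false;

    PeriodicCandidateCache(Supplier<List<T>> expensiveScan,
                           LongSupplier clockMillis,
                           long refreshIntervalMillis) {
        this.expensiveScan = expensiveScan;
        this.clockMillis = clockMillis;
        this.refreshIntervalMillis = refreshIntervalMillis;
    }

    // Called on every heartbeat; rescans only when the cached list is stale.
    synchronized List<T> getCandidates() {
        long now = clockMillis.getAsLong();
        if (!everRefreshed || now - lastRefreshMillis >= refreshIntervalMillis) {
            cached = new ArrayList<>(expensiveScan.get());
            lastRefreshMillis = now;
            everRefreshed = true;
        }
        return cached;
    }
}
```

With this shape, heartbeats that arrive within the refresh interval reuse the cached list and skip the scan. Because the list can be stale, a caller would presumably still need a cheap re-check of the chosen candidate (for example, that the task has not already finished) before launching a speculative attempt.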

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

