hadoop-mapreduce-issues mailing list archives

From "Liyin Liang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2339) optimize JobInProgress.getTaskInProgress(taskid)
Date Tue, 19 Jul 2011 02:22:57 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067452#comment-13067452
] 

Liyin Liang commented on MAPREDUCE-2339:
----------------------------------------

Nice patch!
A user submitted a job with more than 680,000 map tasks to our cluster. The JobTracker then
became slow at processing heartbeats: many threads were blocked and many requests were queued.
From jstack output of the JobTracker process, we found that most of the time was spent in
JIP.getTaskInProgress(). This patch is a good way to improve the performance of
JIP.getTaskInProgress(), and it fixes our problem.

> optimize JobInProgress.getTaskInProgress(taskid)
> ------------------------------------------------
>
>                 Key: MAPREDUCE-2339
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2339
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>    Affects Versions: 0.20.2, 0.21.0
>            Reporter: Kang Xiao
>         Attachments: MAPREDUCE-2339.patch, MAPREDUCE-2339.patch
>
>
> JobInProgress.getTaskInProgress(taskid) uses a linear search to get the TaskInProgress
> object by taskid. In fact, it can be replaced by a much more efficient array index operation.
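
The difference between the two lookup strategies described above can be sketched roughly as follows. This is not the attached patch and not Hadoop's actual code: the class TaskLookupSketch, the Task stand-in and the helper names are hypothetical, and the sketch assumes that the numeric part of a task ID equals the task's position in the job's task array.

```java
public class TaskLookupSketch {

    /** Hypothetical stand-in for TaskInProgress; it only records its index within the job. */
    static final class Task {
        final int index;

        Task(int index) {
            this.index = index;
        }
    }

    /** Builds n tasks so that tasks[i].index == i (the assumption this sketch relies on). */
    static Task[] makeTasks(int n) {
        Task[] tasks = new Task[n];
        for (int i = 0; i < n; i++) {
            tasks[i] = new Task(i);
        }
        return tasks;
    }

    /** Linear search: walks the array until the index matches, O(n) per lookup. */
    static Task findLinear(Task[] tasks, int index) {
        for (Task t : tasks) {
            if (t.index == index) {
                return t;
            }
        }
        return null;
    }

    /** Array index operation: O(1) per lookup, with a bounds check for unknown IDs. */
    static Task findIndexed(Task[] tasks, int index) {
        if (index < 0 || index >= tasks.length) {
            return null;
        }
        return tasks[index];
    }

    public static void main(String[] args) {
        // Job size taken from the comment above: roughly 680,000 map tasks.
        Task[] tasks = makeTasks(680_000);
        Task viaScan = findLinear(tasks, 679_999);
        Task viaIndex = findIndexed(tasks, 679_999);
        System.out.println("same task found: " + (viaScan == viaIndex));
    }
}
```

With a linear search, every lookup for a task near the end of a 680,000-task job touches almost the whole array, and if such lookups happen on each heartbeat the cost adds up quickly. The indexed version does the same work in constant time.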

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
