hadoop-mapreduce-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2340) optimize JobInProgress.initTasks()
Date Mon, 21 Feb 2011 07:37:38 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997292#comment-12997292 ]

Aaron T. Myers commented on MAPREDUCE-2340:
-------------------------------------------

Thanks a lot for doing this performance analysis, Kang. Your results seem promising.

Quick comment on the patch: it seems to me that if you find {{node}} to be {{null}}, you should
assign the result of {{jobtracker.resolveAndAddToTopology(host)}} to the {{node}}
variable. As it stands, {{node}} will still be {{null}} when it enters the loop.
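
To make the suggestion concrete, here is a minimal, self-contained sketch of the pattern, not the actual patch. The {{Node}} class, {{TopologyCacheSketch}}, {{getNode}}, {{countLevels}} and {{resolveCalls}} are hypothetical stand-ins invented for illustration; only the names {{hostnameToNodeMap}} and {{resolveAndAddToTopology}} come from the issue, and their bodies here are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class TopologyCacheSketch {
    /** Hypothetical stand-in for a topology node; not Hadoop's own class. */
    static final class Node {
        final String name;
        final Node parent;

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // Hostname-to-node cache, playing the role of hostnameToNodeMap.
    private final Map<String, Node> hostnameToNodeMap = new HashMap<>();

    // Counts how often the (assumed expensive) resolution path is taken.
    int resolveCalls = 0;

    /** Cache lookup only; returns null on a miss. */
    Node getNode(String host) {
        return hostnameToNodeMap.get(host);
    }

    /** Assumed behaviour: resolve the host, cache the node, and return it. */
    Node resolveAndAddToTopology(String host) {
        resolveCalls++;
        Node rack = new Node("/default-rack", null);
        Node node = new Node(host, rack);
        hostnameToNodeMap.put(host, node);
        return node;
    }

    /** Walks from the host's node up to the root and counts the levels. */
    int countLevels(String host) {
        Node node = getNode(host);
        if (node == null) {
            // Keep the resolved node. Dropping this return value would
            // leave node == null when the loop below starts.
            node = resolveAndAddToTopology(host);
        }
        int levels = 0;
        for (Node n = node; n != null; n = n.parent) {
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        TopologyCacheSketch sketch = new TopologyCacheSketch();
        System.out.println(sketch.countLevels("host1")); // cache miss, resolves
        System.out.println(sketch.countLevels("host1")); // cache hit
        System.out.println(sketch.resolveCalls);
    }
}
```

With the assignment in place, the first lookup for a host resolves and caches it, and later lookups are served from the map without resolving again.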

> optimize JobInProgress.initTasks()
> ----------------------------------
>
>                 Key: MAPREDUCE-2340
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2340
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>    Affects Versions: 0.20.1, 0.21.0
>            Reporter: Kang Xiao
>         Attachments: MAPREDUCE-2340.patch
>
>
> JobTracker's hostnameToNodeMap cache can speed up JobInProgress.initTasks() and JobInProgress.createCache()
> significantly. In a test with 1 job of 100000 maps on a 2400-node cluster, initTasks() and
> createCache() ran nearly 10 and 50 times faster, respectively.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
