hadoop-mapreduce-issues mailing list archives

From "Rajesh Balamohan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1904) Reducing locking contention in TaskTracker.MapOutputServlet's LocalDirAllocator
Date Mon, 13 Sep 2010 04:19:34 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908604#action_12908604 ]

Rajesh Balamohan commented on MAPREDUCE-1904:

Thanks for the review comments, Arun.

1. For #1, I will post the profiler output showing which methods are expensive in getLocalPathToRead().

2. For #2, LocalDirAllocator.confChanged() need not be called in this TaskTracker context.

Reason: In this context, TaskTracker uses LocalDirAllocator to check for any config changes
related to "mapred.local.dir". Once it is read, this parameter does not change over the
TaskTracker's lifetime, so it is not necessary to do this check on every invocation.
Corner case: When the TaskTracker goes down and new configs are reloaded, the LRUCache would
also be repopulated.

> Reducing locking contention in TaskTracker.MapOutputServlet's LocalDirAllocator
> -------------------------------------------------------------------------------
>                 Key: MAPREDUCE-1904
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1904
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>    Affects Versions: 0.20.1
>            Reporter: Rajesh Balamohan
>         Attachments: MAPREDUCE-1904-RC10.patch, MAPREDUCE-1904-trunk.patch, profiler output after applying the patch.jpg, TaskTracker- yourkit profiler output .jpg, Thread profiler output showing contention.jpg
> While profiling the TaskTracker with the Sort benchmark, it was observed that threads block on LocalDirAllocator.getLocalPathToRead() in order to get the index file and temporary map output.
> As LocalDirAllocator is tied to the ServletContext, only one instance is available per TaskTracker HTTP server. Given the jobid & mapid, LocalDirAllocator retrieves the index file path and the temporary map output file path. getLocalPathToRead() is internally synchronized.
> Introducing an LRUCache for this lookup reduces the contention heavily (LRUCache with key = jobid + mapid and value = PATH to the file). The size of the LRUCache can be varied based on the environment, and I observed a throughput improvement on the order of 4-7% with the introduction of the LRUCache.
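
For readers without the patch at hand, the lookup cache described above could look roughly like the sketch below. This is not the code from the attached patches: the class name MapOutputPathCache, the "/" separator in the key, and the use of plain strings for paths are all assumptions made for illustration. It relies only on java.util.LinkedHashMap in access-order mode to get LRU eviction.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the class used in the MAPREDUCE-1904 patches.
final class MapOutputPathCache {
    private final Map<String, String> cache;

    MapOutputPathCache(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        Map<String, String> lru = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // Evict the least recently used entry once the cap is exceeded.
                return size() > maxEntries;
            }
        };
        // In access-order mode get() reorders entries, so reads need synchronization too.
        this.cache = Collections.synchronizedMap(lru);
    }

    // Assumed key layout: jobid and mapid joined by a separator.
    private static String key(String jobId, String mapId) {
        return jobId + "/" + mapId;
    }

    // Returns the cached path, or null if the caller must fall back to the allocator.
    String get(String jobId, String mapId) {
        return cache.get(key(jobId, mapId));
    }

    void put(String jobId, String mapId, String path) {
        cache.put(key(jobId, mapId), path);
    }

    int size() {
        return cache.size();
    }
}
```

A caller would try get() first and call getLocalPathToRead() only on a miss, then put() the result. The cache still takes a lock, but it is held only for an in-memory map operation rather than for the full lookup inside the synchronized allocator call.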

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
