hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3366) Shuffle/Merge improvements
Date Fri, 06 Jun 2008 12:51:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603031#action_12603031 ]

Devaraj Das commented on HADOOP-3366:
-------------------------------------

Some comments: 
0) The inMem merge thread needs to ignore its usual merge criteria when the shuffle thread
notifies it to do a forced merge. 
1) A race condition exists in the interval between the ramManager.notify and mergePassComplete.wait()
calls in getMapOutput. What could happen is that the ramManager gets notified and the merge
*finishes* *before* this thread calls mergePassComplete.wait(). If this happens, the notification
from the merger is lost and this thread will wait indefinitely. 
2) The handshake between the merger, the copier and the ramManager looks complex, and there
could be more race conditions like the one I pointed out above. Sharad and I had a quick
discussion and we feel it can be simplified: 
   Have ramManager.reserve block the calling thread if the request cannot be satisfied 
   Have ramManager.unreserve do a notifyAll (the mergeThread is the one that calls unreserve) 
   Have the shuffle thread notify the mergeThread (before it goes to wait) 
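
To make the suggestion in 2) concrete, here is a rough sketch of the handshake. It is only an illustration under stated assumptions, not the patch and not the actual ramManager code: the class name RamManagerSketch, the mergeRequested flag and the awaitMergeRequest method are hypothetical. The point is that every wait() sits in a loop that re-checks shared state, so a notification that arrives before the wait (the race in 1) cannot be lost.

```java
/**
 * Hypothetical sketch of the simplified handshake; names are illustrative
 * and do not come from the Hadoop source.
 */
class RamManagerSketch {
    private final int capacity;            // total bytes available for in-memory map outputs
    private int used = 0;                  // bytes currently reserved
    private boolean mergeRequested = false; // set by a shuffle thread that is about to block

    RamManagerSketch(int capacity) {
        this.capacity = capacity;
    }

    /**
     * Called by a shuffle (copier) thread. Blocks until 'size' bytes fit.
     * Returns false if the thread was interrupted while waiting.
     */
    synchronized boolean reserve(int size) {
        if (size > capacity) {
            throw new IllegalArgumentException("request can never be satisfied");
        }
        while (used + size > capacity) {
            // Record the merge request in shared state and wake the merge
            // thread before going to wait.
            mergeRequested = true;
            notifyAll();
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        used += size;
        return true;
    }

    /** Called by the merge thread after it has freed 'size' bytes. */
    synchronized void unreserve(int size) {
        used -= size;
        notifyAll(); // wake every blocked reserve() so each re-checks its condition
    }

    /**
     * Called by the merge thread. Blocks until some shuffle thread has asked
     * for a merge. Because the request is a flag and not a bare notify, it is
     * still seen if it was raised before this call.
     */
    synchronized boolean awaitMergeRequest() {
        while (!mergeRequested) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        mergeRequested = false;
        return true;
    }

    synchronized int used() {
        return used;
    }
}
```

With this shape it does not matter whether the merge thread starts waiting before or after the shuffle thread raises the request, and it does not matter whether unreserve runs before or after the shuffle thread reaches wait(): in both orders the loop condition decides, not the timing of the notify.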

> Shuffle/Merge improvements
> --------------------------
>
>                 Key: HADOOP-3366
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3366
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.18.0
>
>         Attachments: 3366.1.patch, 3366.1.patch, HADOOP-3366_0_20080605.patch, ifile.patch
>
>
> This is intended to be a meta-issue to track various improvements to shuffle/merge in the reducer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

