hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4842) Shuffle race can hang reducer
Date Thu, 06 Dec 2012 21:37:10 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13514648#comment-13514648 ]

Jason Lowe commented on MAPREDUCE-4842:
---------------------------------------

If the problem is that we are creating too many merges, wouldn't Asokan's approach have the
same issue?  We would schedule a merge as soon as we hit the commit threshold, because that
approach would not delay while a merge was in progress; instead it would queue the next merge
chunk on the list.  Or maybe I'm misunderstanding the proposed change?

Asokan, please post a patch.  It would help ensure we all are on the same page.  Thanks!
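
As an illustration of the concern above, here is a minimal sketch that counts how many merge
chunks get created under the two policies being compared: delaying while a merge is in
progress versus queueing the next chunk immediately.  It is not Hadoop's MergeManager code and
not Asokan's patch; the function plan_merges, its parameters, and the tick-based timing model
are all hypothetical and exist only to make the comparison concrete.

```python
def plan_merges(segment_sizes, threshold, merge_ticks, queue_when_busy):
    """Return the list of merge chunks created for a stream of committed segments.

    Hypothetical model: one segment is committed per tick, a merge occupies the
    merger for `merge_ticks` ticks, and a chunk becomes eligible once the pending
    segments total at least `threshold`.
    """
    pending = []      # committed segments not yet handed to a merge
    chunks = []       # every merge chunk that was created, in order
    busy_until = 0    # tick at which the merger becomes idle again

    for tick, size in enumerate(segment_sizes):
        pending.append(size)
        if sum(pending) < threshold:
            continue  # threshold not reached yet, keep accumulating

        merge_running = tick < busy_until
        if merge_running and not queue_when_busy:
            continue  # "delay" policy: wait, so the next chunk grows larger

        # Either the merger is idle, or the "queue" policy creates a chunk anyway.
        chunks.append(pending)
        pending = []
        # A queued chunk starts only after the running merge has finished.
        busy_until = max(busy_until, tick) + merge_ticks

    return chunks


if __name__ == "__main__":
    sizes = [1] * 8
    print("queue:", plan_merges(sizes, threshold=2, merge_ticks=3, queue_when_busy=True))
    print("delay:", plan_merges(sizes, threshold=2, merge_ticks=3, queue_when_busy=False))
```

Under this toy model the queueing policy creates one chunk per threshold crossing, while the
delaying policy creates fewer, larger chunks.  Whether the real change behaves this way is
exactly the open question, which is why a patch would help.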
                
> Shuffle race can hang reducer
> -----------------------------
>
>                 Key: MAPREDUCE-4842
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4842
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 2.0.2-alpha, 0.23.5
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>         Attachments: MAPREDUCE-4842.patch, MAPREDUCE-4842.patch, MAPREDUCE-4842.patch,
>                      MAPREDUCE-4842.patch
>
>
> Saw an instance where the shuffle caused multiple reducers in a job to hang.  It looked
> similar to the problem described in MAPREDUCE-3721, where the fetchers were all being told
> to WAIT by the MergeManager but no merge was taking place.
