hadoop-common-dev mailing list archives

From "Ravi Gummadi (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-5572) The map progress value should have a separate phase for doing the final sort.
Date Wed, 06 May 2009 16:11:30 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-5572:

    Attachment: HADOOP-5572.v1.patch

Incorporated Jothi's first three comments.
Discussed comments 4 and 5 with Jothi offline. For comment 4, there seems to be
no cleaner way, so I am keeping it as is. Regarding comment 5, checking for empty segments (by
reading the segments) before the actual merges seems to be costly in terms of performance. So we are
not handling empty segments separately in our estimation, on the assumption that this won't hurt
the approximation of mergeProgress much.

Fixed an issue in informReduceProgress() by changing the call from Progress.get() to Progress.getInternal(),
because we need the progress of this phase/node only (and not of the whole tree). Made Progress.getInternal()
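
To illustrate the difference between whole-tree and per-node progress, here is a minimal sketch. It is not the Progress class from the patch: the class name PhaseProgress, the equal weighting of phases, and the method bodies are all assumptions made for illustration. Only the names get() and getInternal() are taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified progress tree. This is NOT Hadoop's Progress class;
// it only mirrors the get()/getInternal() distinction described in the message.
class PhaseProgress {
    private PhaseProgress parent;                        // null for the root node
    private final List<PhaseProgress> phases = new ArrayList<>();
    private int currentPhase = 0;                        // index of the running child phase
    private float progress = 0f;                         // used only by leaf nodes

    /** Adds a child phase (assumed equally weighted) and returns it. */
    PhaseProgress addPhase() {
        PhaseProgress phase = new PhaseProgress();
        phase.parent = this;
        phases.add(phase);
        return phase;
    }

    /** Sets the completed fraction (0..1) of a leaf node. */
    void set(float fraction) {
        progress = fraction;
    }

    /** Marks the current child phase as finished and moves to the next one. */
    void startNextPhase() {
        currentPhase++;
    }

    /** Progress of the whole tree: climb to the root, then evaluate it. */
    float get() {
        PhaseProgress node = this;
        while (node.parent != null) {
            node = node.parent;
        }
        return node.getInternal();
    }

    /** Progress of this node and its own sub-phases only. */
    float getInternal() {
        if (phases.isEmpty()) {
            return progress;
        }
        float inCurrent = currentPhase < phases.size()
                ? phases.get(currentPhase).getInternal()
                : 0f;
        return (currentPhase + inCurrent) / phases.size();
    }
}
```

In this sketch, a root with two phases whose first phase is complete and whose second phase is half done reports 0.5 from getInternal() on the second phase, but 0.75 from get() on that same node, because get() evaluates the whole tree from the root.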

Attaching patch with the above changes. Please review and provide your comments.

> The map progress value should have a separate phase for doing the final sort.
> -----------------------------------------------------------------------------
>                 Key: HADOOP-5572
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5572
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-5572.patch, HADOOP-5572.v1.patch
> Currently, the final spill and sort doesn't record any progress while it runs, leading
> to the perception that the map is done, but "stuck".

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
