hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1105) Reducers don't make "progress" while iterating through values
Date Mon, 02 Apr 2007 06:48:32 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1105:

    Attachment: 1105.patch

This patch makes the fields that reportProgress reads volatile (reportProgress runs in a
separate thread), instead of making a "synchronized" call in ReduceTask.
The main change here (in the patches) is that reportProgress is no longer called as part of
every invocation of reducer(key, value[]); instead, the progress field is just set, and the
thread does the actual job of reporting it to the TaskTracker. This avoids the overhead of
making RPC connections to an overloaded/slow TaskTracker inline with the reducer method
invocations. The same applies to the Reporter.setStatus call (in Task.java). This should
improve reducer performance.
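
The approach described above can be sketched roughly as follows. This is only an
illustration, not the code in 1105.patch: the class ProgressReporterSketch, the Sink
interface (a stand-in for the RPC channel to the TaskTracker), the dirty flag and the
reporting interval are all hypothetical names and assumptions.

```java
/** Illustrative sketch only; names here are hypothetical, not Hadoop's. */
class ProgressReporterSketch {

    /** Hypothetical stand-in for the RPC call to the TaskTracker. */
    interface Sink {
        void report(float progress, String status);
    }

    // Written by the reduce thread, read by the reporter thread.
    // volatile makes the latest write visible without a synchronized block.
    private volatile float progress;
    private volatile String status = "";
    private volatile boolean dirty;          // true when something new was set
    private volatile boolean running = true;

    private final Sink sink;
    private final long intervalMillis;

    ProgressReporterSketch(Sink sink, long intervalMillis) {
        this.sink = sink;
        this.intervalMillis = intervalMillis;
    }

    /** Cheap call made inline with reduce(): only sets fields, no RPC. */
    void setProgress(float p) {
        progress = p;
        dirty = true;
    }

    /** Same idea for status updates. */
    void setStatus(String s) {
        status = s;
        dirty = true;
    }

    /** Sends the latest values if anything changed; returns true if it sent. */
    boolean reportOnce() {
        if (!dirty) {
            return false;
        }
        dirty = false;                   // clear first so a later set is not lost
        sink.report(progress, status);   // the slow call happens off the reduce path
        return true;
    }

    /** Starts the background thread that periodically reports progress. */
    Thread startReporter() {
        Thread t = new Thread(() -> {
            while (running) {
                reportOnce();
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    return;              // stop reporting when interrupted
                }
            }
        }, "progress-reporter");
        t.setDaemon(true);
        t.start();
        return t;
    }

    void stop() {
        running = false;
    }
}
```

In this sketch the reduce loop would call setProgress (and setStatus) as often as it
likes, while only the reporter thread pays the cost of talking to a slow TaskTracker.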

> Reducers don't make "progress" while iterating through values
> -------------------------------------------------------------
>                 Key: HADOOP-1105
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1105
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.0
>            Reporter: Owen O'Malley
>         Assigned To: Owen O'Malley
>             Fix For: 0.12.3
>         Attachments: 1105.patch, 1105.patch
> Reduces make progress when they go to a new key, but not when they read the next value,
> which could cause reduces to time out when they have a lot of values for the same key.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
