hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1270) Randomize the fetch of map outputs
Date Thu, 03 May 2007 20:14:15 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Doug Cutting updated HADOOP-1270:

    Status: Open  (was: Patch Available)

Sorry, this patch no longer applies cleanly to trunk.  Can you please generate a new version?

> Randomize the fetch of map outputs
> ----------------------------------
>                 Key: HADOOP-1270
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1270
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Arun C Murthy
>         Assigned To: Arun C Murthy
>             Fix For: 0.13.0
>         Attachments: HADOOP-1270_20070425_1.patch, post-H-1270.png, pre-H-1270.png
> HADOOP-248 did away with random probing of maps for locating map outputs; we now rely
> on TaskCompletionEvents instead.
> However, we lost a side benefit of that random probing: the map's jetty was not overloaded
> with requests for its outputs. We now have a situation where a map completes, the JT is
> notified, *all* the reduces get the TaskCompletionEvent and pretty much swamp the poor
> map's jetty, and this repeats for each map.
> I propose a minor change: collect a set of TaskCompletionEvents and randomize the list
> before firing the fetches. That should help fix this mass-hysteria at the map's jetty.
> Thoughts?
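
The proposal quoted above (collect a batch of completion events, then randomize the order before fetching) could look roughly like the sketch below. This is only an illustration and not the code in the attached patch: the class name ShuffledFetchSketch, the method randomizeFetchOrder, and the map IDs are all hypothetical, and plain strings stand in for TaskCompletionEvents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch only; this is not the code from the HADOOP-1270 patch.
class ShuffledFetchSketch {

    // Returns a shuffled copy of a batch of completed maps, so that each
    // reduce walks the batch in its own order instead of all reduces
    // hitting the same map's jetty at the same moment.
    static <T> List<T> randomizeFetchOrder(List<T> completedMaps, Random rng) {
        List<T> order = new ArrayList<>(completedMaps); // leave the caller's batch untouched
        Collections.shuffle(order, rng);
        return order;
    }

    public static void main(String[] args) {
        // Stand-ins for a batch of collected completion events (made-up IDs).
        List<String> batch = new ArrayList<>();
        batch.add("map_0001");
        batch.add("map_0002");
        batch.add("map_0003");
        batch.add("map_0004");

        // Assumption: each reduce uses its own, differently seeded Random.
        Random rng = new Random();
        for (String mapId : randomizeFetchOrder(batch, rng)) {
            System.out.println("fetch output of " + mapId);
        }
    }
}
```

The sketch shuffles a copy rather than the batch itself, so whatever bookkeeping holds the collected events keeps its original order. Whether the real change shuffles in place or per batch is not stated in this message.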

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
