hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-254) use http to shuffle data between the maps and the reduces
Date Fri, 26 May 2006 18:41:31 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-254?page=all ]

Owen O'Malley updated HADOOP-254:

    Attachment: http-shuffle-2.patch

Ok, here is an updated patch that addresses Doug's concerns.

1. The local file system is now used for reading and writing the map output files. Hopefully,
this won't hurt our performance too much.

2. Exceptions while reading the map outputs now call TaskTracker.lostMapOutput, so the map is
marked as failed. (I tested this by manually changing a character in one of the map output
files and making sure that the map reran.)

3. I removed the assumption that the TaskTracker is a singleton.

4. I added set/getAttribute to the StatusHttpServer so that the user can pass objects to the
JSP code.

5. I removed more of the dead code (RPC.callRaw, MapOutputFile).
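
For readers without the patch at hand, here is a rough, self-contained sketch of how items 2 through 4 could fit together. It is not the code in http-shuffle-2.patch. MiniStatusServer, MiniTracker, serveMapOutput, the "task.tracker" attribute name, and the CRC32 check are all hypothetical stand-ins chosen for illustration; only the ideas (an attribute registry on the HTTP server, a non-singleton tracker, and reporting a lost map output when a read fails) come from the list above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.zip.CRC32;

public class ShuffleSketch {

    /** Hypothetical stand-in for the HTTP server's set/getAttribute registry (item 4). */
    static class MiniStatusServer {
        private final Map<String, Object> attributes = new ConcurrentHashMap<>();

        void setAttribute(String name, Object value) { attributes.put(name, value); }

        Object getAttribute(String name) { return attributes.get(name); }
    }

    /** Hypothetical stand-in for a task tracker; nothing assumes a single instance (item 3). */
    static class MiniTracker {
        final List<String> lostOutputs = new ArrayList<>();

        // Records that a map's output is unusable so the map can be rerun (item 2).
        void lostMapOutput(String mapTaskId) { lostOutputs.add(mapTaskId); }
    }

    /** CRC32 of a byte array; the use of CRC32 here is an assumption of this sketch. */
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    /**
     * Serving side: finds the tracker through the server's attributes instead of a
     * static singleton, reads the map output from the local file system, and reports
     * the output as lost if the read or the checksum verification fails.
     */
    static byte[] serveMapOutput(MiniStatusServer server, String mapTaskId,
                                 Path file, long expectedCrc) throws IOException {
        MiniTracker tracker = (MiniTracker) server.getAttribute("task.tracker");
        try {
            byte[] data = Files.readAllBytes(file);
            if (checksum(data) != expectedCrc) {
                throw new IOException("checksum mismatch in " + file);
            }
            return data;
        } catch (IOException e) {
            tracker.lostMapOutput(mapTaskId);
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        MiniStatusServer server = new MiniStatusServer();
        MiniTracker tracker = new MiniTracker();
        server.setAttribute("task.tracker", tracker);

        // Write a small map output file and remember its checksum.
        Path file = Files.createTempFile("map_0001", ".out");
        byte[] contents = "key\tvalue\n".getBytes(StandardCharsets.UTF_8);
        Files.write(file, contents);
        long crc = checksum(contents);

        serveMapOutput(server, "map_0001", file, crc); // clean read, nothing reported

        // Mimic the manual test: change one character and read again.
        contents[0] = 'K';
        Files.write(file, contents);
        try {
            serveMapOutput(server, "map_0001", file, crc);
        } catch (IOException e) {
            System.out.println("read failed: " + e.getMessage());
        }
        System.out.println("lost outputs: " + tracker.lostOutputs);
        Files.delete(file);
    }
}
```

Passing the tracker through a server attribute rather than a static field is what would let more than one tracker live in the same JVM, which is why the sketch shows items 3 and 4 together.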

> use http to shuffle data between the maps and the reduces
> ---------------------------------------------------------
>          Key: HADOOP-254
>          URL: http://issues.apache.org/jira/browse/HADOOP-254
>      Project: Hadoop
>         Type: Improvement

>   Components: mapred
>     Versions: 0.2.1
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>  Attachments: http-shuffle-2.patch, http-shuffle.patch
> To speed up the shuffle time, I'll use http (via the task tracker's jetty server) to send the map outputs.

This message is automatically generated by JIRA.