hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-254) use http to shuffle data between the maps and the reduces
Date Thu, 25 May 2006 20:25:30 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-254?page=comments#action_12413298 ] 

Doug Cutting commented on HADOOP-254:

This looks great!

A few suggested improvements:

1. In MapOutputLocation.getFile(), shouldn't everything that gets opened be closed in a 'finally' clause?
2. Does MapOutputFile still need to be a Writable?  I don't think so.  We should remove its
write & readFields implementations, along with any other methods that are no longer called.
3. Do we have any way to detect when map outputs are lost or corrupted?  That was a useful
mechanism that I'd hate to lose.
4. Sameer promised that you'd remove RPC.callRaw() in this patch.
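
To illustrate point 1, here is a minimal sketch of the close-in-'finally' pattern. It is not the code from http-shuffle.patch: the class FinallyCloseSketch and the method copyToFile are hypothetical names, and the only assumption is that getFile() copies an input stream into a local file.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration only; not the actual MapOutputLocation code.
public class FinallyCloseSketch {

    /**
     * Copies all bytes from 'in' to 'target' and returns the byte count.
     * Both streams are closed on every path, including when the copy throws.
     */
    public static long copyToFile(InputStream in, File target) throws IOException {
        OutputStream out = null;
        long total = 0;
        try {
            out = new FileOutputStream(target);
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        } finally {
            // Runs whether the copy succeeded or threw an exception.
            try {
                if (out != null) {
                    out.close();
                }
            } finally {
                // Still runs if closing 'out' itself fails.
                in.close();
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("map-output", ".tmp");
        tmp.deleteOnExit();
        byte[] data = "hello".getBytes("UTF-8");
        long copied = copyToFile(new ByteArrayInputStream(data), tmp);
        System.out.println(copied + " bytes copied");
    }
}
```

On Java 7 and later, a try-with-resources statement gives the same guarantee with less code; the explicit 'finally' form is shown because it is what the comment asks for.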

> use http to shuffle data between the maps and the reduces
> ---------------------------------------------------------
>          Key: HADOOP-254
>          URL: http://issues.apache.org/jira/browse/HADOOP-254
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Versions: 0.2.1
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>  Attachments: http-shuffle.patch
> To speed up the shuffle time, I'll use http (via the task tracker's jetty server) to send the map outputs.

This message is automatically generated by JIRA.