hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-254) use http to shuffle data between the maps and the reduces
Date Thu, 25 May 2006 20:25:30 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-254?page=comments#action_12413298 ] 

Doug Cutting commented on HADOOP-254:
-------------------------------------

This looks great!

A few suggested improvements:

1. In MapOutputLocation.getFile(), shouldn't the resources it opens be closed in a 'finally'
clause?
2. Does MapOutputFile still need to be a Writable?  I don't think so.  We should remove its
write() and readFields() implementations, along with any other methods that are no longer
called.
3. Do we have any way to detect when map outputs are lost or corrupted?  That was a useful
mechanism that I'd hate to lose.
4. Sameer promised that you'd remove RPC.callRaw() in this patch.
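
To illustrate point 1, here is a minimal sketch of closing streams in a 'finally' clause.
It is not the code from http-shuffle.patch: the class FinallyCloseSketch, the helper
copyAndClose(), and the two test streams are hypothetical names made up for this example,
and the real getFile() may open different resources. The idea is only that both streams
get closed whether or not the copy throws.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class FinallyCloseSketch {

    /**
     * Hypothetical helper (not part of Hadoop): copies everything from
     * 'in' to 'out' and closes both streams, even if the copy fails.
     */
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        long total = 0;
        try {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        } finally {
            // Nested so that 'out' is still closed if in.close() throws.
            try {
                in.close();
            } finally {
                out.close();
            }
        }
        return total;
    }

    /** Input stream that records whether close() was called. */
    static class TrackingInput extends ByteArrayInputStream {
        boolean closed = false;

        TrackingInput(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Output stream that always fails on write, to simulate a broken transfer. */
    static class FailingOutput extends OutputStream {
        boolean closed = false;

        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated write failure");
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        TrackingInput in = new TrackingInput(new byte[] {1, 2, 3});
        FailingOutput out = new FailingOutput();
        try {
            copyAndClose(in, out);
        } catch (IOException e) {
            System.out.println("copy failed: " + e.getMessage());
        }
        System.out.println("input closed: " + in.closed);
        System.out.println("output closed: " + out.closed);
    }
}
```

Without the 'finally' clause, a failure in the middle of the copy would skip both close()
calls and leak the underlying file or socket handles.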


> use http to shuffle data between the maps and the reduces
> ---------------------------------------------------------
>
>          Key: HADOOP-254
>          URL: http://issues.apache.org/jira/browse/HADOOP-254
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Versions: 0.2.1
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>  Attachments: http-shuffle.patch
>
> To speed up the shuffle time, I'll use http (via the task tracker's jetty server) to
> send the map outputs.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

