hadoop-common-dev mailing list archives

From "paul sutter (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-195) transfer map output transfer with http instead of rpc
Date Sun, 07 May 2006 20:50:21 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-195?page=comments#action_12378315 ] 

paul sutter commented on HADOOP-195:
------------------------------------


eric,

most of my suggestions relate to the copy phase of the sort path, not the sort itself. once
that is working, i can make sort suggestions (although my best sort suggestion is for you
guys to talk with david cossock about sorts).

this whole area is critical. on that cluster, owen's 2TB should sort in 10 minutes, and the
data should be copied in less than that time, for a total run time of <20 minutes. 
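
For scale, the aggregate rate implied by the 10-minute figure can be worked out directly. This is only a rough sketch: it assumes decimal units (1 TB = 10**12 bytes), which the comment does not specify, and the function names are illustrative. The cluster size is not stated in this message, so the per-node helper takes the node count as a parameter rather than assuming one.

```python
# Rough throughput arithmetic for "2TB should sort in 10 minutes".
# Assumption: 1 TB = 10**12 bytes (decimal); the message does not say which unit is meant.
TB = 10**12


def aggregate_rate_gb_per_s(total_bytes, minutes):
    """Cluster-wide rate in GB/s needed to process total_bytes within the given minutes."""
    return total_bytes / (minutes * 60) / 10**9


def per_node_rate_mb_per_s(total_bytes, minutes, nodes):
    """Per-node rate in MB/s, assuming the work is spread evenly over `nodes` machines.

    `nodes` is a hypothetical input: the cluster size is not given in the message.
    """
    return total_bytes / (minutes * 60) / nodes / 10**6


if __name__ == "__main__":
    rate = aggregate_rate_gb_per_s(2 * TB, 10)
    print(f"aggregate rate for a 10-minute pass over 2 TB: {rate:.2f} GB/s")
```

The same rate is a lower bound for the copy phase, since the message expects the copy to finish in less than those 10 minutes.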

pleased that yahoo has resources to apply. 

paul

> transfer map output transfer with http instead of rpc
> -----------------------------------------------------
>
>          Key: HADOOP-195
>          URL: http://issues.apache.org/jira/browse/HADOOP-195
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Versions: 0.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>
> The map output data should be transferred via http instead of rpc, because
> rpc is very slow for this application and the timeout behavior is suboptimal (the server
> sends data and the client ignores it because it took more than 10 seconds to be received).

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

