hadoop-common-dev mailing list archives

From "Sameer Paranjpye (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-195) transfer map output transfer with http instead of rpc
Date Wed, 24 May 2006 19:05:30 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-195?page=comments#action_12413166 ] 

Sameer Paranjpye commented on HADOOP-195:

I'd agree that the callRaw stuff belongs in Client.java. I've copied the call dispatch code
from Client.sendParam, but that's a violation of abstraction. I really wanted to get this off
the ground, so I didn't mess too much with the Client code. The approach you outline is the
cleaner way to do it.


Owen's testing the parallel fetch against an HTTP server by having Jetty serve the map outputs,
and it appears to be working a lot better than the RPC server. If we go in that direction, we
might decide we don't need 'callRaw' at all, since this is the only use case so far.

Either way I can commit to cleaning this up.

> transfer map output transfer with http instead of rpc
> -----------------------------------------------------
>          Key: HADOOP-195
>          URL: http://issues.apache.org/jira/browse/HADOOP-195
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Versions: 0.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>  Attachments: MapFileSimulator.java, data-transfer-chart.pdf, mapfilesimulator-big.txt,
>               mapfilesimulator-sort2.txt, netstat.log, netstat.xls, parallel-copiers.txt
> The map output should be transferred via http instead of rpc, because rpc is very slow
> for this application and the timeout behavior is suboptimal. (The server sends data and
> the client ignores it because it took more than 10 seconds to be received.)

This message is automatically generated by JIRA.
