hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-195) transfer map output transfer with http instead of rpc
Date Wed, 24 May 2006 18:28:31 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-195?page=comments#action_12413160 ] 

Doug Cutting commented on HADOOP-195:

This is great!

The RPC.callRaw() method really seems like it belongs in Client.java, and should mostly be
code that's shared with non-raw calls.  Otherwise we duplicate code: if we change the format
of calls on the wire or the way errors are handled, etc. this code will break.

So instead we might:

1. Make RPC.Invocation public.
2. Add a boolean 'newConnection' parameter to Client.call(), adding the signature:
    Writable call(Writable, InetAddress, boolean newConnection);
    This method can be implemented by constructing a Connection instance but never calling
connection.start(). The body of Connection.run() would move to a new method, Connection.getResponse(),
that Client.call() can invoke directly when newConnection is specified.
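
To make the second step concrete, here is a rough sketch of the shape I have in mind. It is not the real ipc code: String stands in for Writable, the "wire" is an in-memory queue that fabricates a reply, and RawCallSketch, sendParam(), sharedConnection() and take() are names made up for illustration. Only Client.call(), Connection.run(), Connection.start() and Connection.getResponse() come from the proposal above.

```java
import java.net.InetAddress;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-ins only; these are not the org.apache.hadoop.ipc classes.
class RawCallSketch {

    static class Connection extends Thread {
        // Simulated wire: whatever the "server" has written back.
        private final BlockingQueue<String> wire = new LinkedBlockingQueue<>();
        // Responses handed over by the receiver thread (shared path only).
        private final BlockingQueue<String> delivered = new LinkedBlockingQueue<>();

        Connection(InetAddress address) {
            // A real connection would open a socket to 'address' here.
            setDaemon(true);
        }

        // Assumption: sending a call makes the fake server answer immediately.
        void sendParam(String param) {
            wire.add("response-to:" + param);
        }

        // The former body of run(): read exactly one response off the wire.
        String getResponse() {
            return take(wire);
        }

        // The receiver loop now just delegates to getResponse().
        @Override
        public void run() {
            while (true) {
                delivered.add(getResponse());
            }
        }
    }

    static class Client {
        private Connection shared;

        String call(String param, InetAddress address, boolean newConnection) {
            if (newConnection) {
                // Dedicated connection: start() is never called, so the
                // caller's own thread reads the response.
                Connection connection = new Connection(address);
                connection.sendParam(param);
                return connection.getResponse();
            }
            Connection connection = sharedConnection(address);
            // One caller at a time here; real code would match replies by call id.
            synchronized (connection) {
                connection.sendParam(param);
                return take(connection.delivered);
            }
        }

        private synchronized Connection sharedConnection(InetAddress address) {
            if (shared == null) {
                shared = new Connection(address);
                shared.start();
            }
            return shared;
        }
    }

    // Blocking take that converts the checked InterruptedException.
    static String take(BlockingQueue<String> queue) {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Both paths go through the same sendParam() and getResponse(), so a change to the wire format or to error handling would only need to be made once.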

Does this make sense?

This way, e.g., if we want to specify a connect timeout for RPC connections (first creating
the socket, then explicitly connecting), change buffer sizes, etc. we can do it in one place.
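
For reference, "first creating the socket, then explicitly connecting" would look roughly like the following with the standard java.net.Socket API. ConnectSketch, open() and its parameter names are made up for illustration, and the choice to set only the receive buffer is an assumption, not something the patch does.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper; the single place where connect options would live.
class ConnectSketch {

    static Socket open(InetAddress host, int port,
                       int connectTimeoutMs, int bufferBytes) throws IOException {
        Socket socket = new Socket();              // created unconnected
        socket.setReceiveBufferSize(bufferBytes);  // configured before connecting
        try {
            // Explicit connect: gives up after connectTimeoutMs milliseconds.
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
        } catch (IOException e) {
            socket.close();                        // don't leak the descriptor
            throw e;
        }
        return socket;
    }
}
```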

If there's agreement to clean this up (or that it doesn't need cleaning up!) then I can commit
this as-is, and we can clean that up as a subsequent step, so that the patch doesn't grow
too huge, the codebase doesn't move, etc.

> transfer map output transfer with http instead of rpc
> -----------------------------------------------------
>          Key: HADOOP-195
>          URL: http://issues.apache.org/jira/browse/HADOOP-195
>      Project: Hadoop
>         Type: Improvement

>   Components: mapred
>     Versions: 0.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>  Attachments: MapFileSimulator.java, data-transfer-chart.pdf, mapfilesimulator-big.txt,
> mapfilesimulator-sort2.txt, netstat.log, netstat.xls, parallel-copiers.txt
>
> The map output should be transferred via http instead of rpc, because
> rpc is very slow for this application and the timeout behavior is suboptimal. (The server sends
> data and the client ignores it because it took more than 10 seconds to be received.)

This message is automatically generated by JIRA.
