hadoop-common-dev mailing list archives

From "eric baldeschwieler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-195) transfer map output transfer with http instead of rpc
Date Thu, 11 May 2006 02:49:05 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-195?page=comments#action_12379001 ] 

eric baldeschwieler commented on HADOOP-195:

The problem with such approaches is that they add a lot of complexity and break down in some
cases.  They are particularly challenged by node failures, and handling those gracefully is
a key requirement for scaling up.  I think there are a lot of good optimizations we can make without
changing the model.  We should do those first and gain some experience operating them before
radically departing from it.

Can you give us some data on the workload challenges you are facing?  In our current benchmark
we are not yet at a point where these issues are the logical ones to tackle.  Maybe you could
publish an alternate benchmark that expresses the demands of your workload?

> transfer map output transfer with http instead of rpc
> -----------------------------------------------------
>          Key: HADOOP-195
>          URL: http://issues.apache.org/jira/browse/HADOOP-195
>      Project: Hadoop
>         Type: Improvement

>   Components: mapred
>     Versions: 0.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>  Attachments: netstat.log, netstat.xls
> The map output data should be transferred via HTTP instead of RPC, because
> RPC is very slow for this application and the timeout behavior is suboptimal. (The server sends
> data and the client ignores it because it took more than 10 seconds to be received.)
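
The timeout complaint in the issue description can be illustrated with a toy model. This is only a sketch, not Hadoop's RPC or HTTP code: the function names are hypothetical, and the per-chunk idle timeout used for the streaming case is an assumption about how a streaming HTTP client might be configured. Only the 10-second figure comes from the description.

```python
from typing import Sequence

# The 10-second limit is taken from the issue description.
RPC_DEADLINE_SECONDS = 10.0


def rpc_style_fetch_succeeds(chunk_seconds: Sequence[float],
                             deadline: float = RPC_DEADLINE_SECONDS) -> bool:
    """Model a call where the whole response must arrive within one deadline.

    If the total transfer time exceeds the deadline, the client discards
    the data even though the server sent all of it.
    """
    return sum(chunk_seconds) <= deadline


def streaming_fetch_succeeds(chunk_seconds: Sequence[float],
                             idle_timeout: float = RPC_DEADLINE_SECONDS) -> bool:
    """Model a streamed transfer with a per-chunk idle timeout (assumed).

    The transfer only fails if the gap before any single chunk is too long,
    so a large but steadily arriving output is still accepted.
    """
    return all(seconds <= idle_timeout for seconds in chunk_seconds)


# A map output that arrives as 6 chunks, 3 seconds apart (18 seconds total).
slow_but_steady = [3.0] * 6

print(rpc_style_fetch_succeeds(slow_but_steady))   # False
print(streaming_fetch_succeeds(slow_but_steady))   # True
```

Under these assumptions, the single-deadline model throws away 18 seconds of completed work, while the streamed model accepts the same transfer because no individual wait exceeds the limit.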

This message is automatically generated by JIRA.