yes of course. Agree with your analysis.
On May 7, 2006, at 1:50 PM, paul sutter (JIRA) wrote:
> [ http://issues.apache.org/jira/browse/HADOOP-195?page=comments#action_12378315 ]
>
> paul sutter commented on HADOOP-195:
> ------------------------------------
>
>
> eric,
>
> most of my suggestions relate to the copy phase of the sort path,
> not the sort itself. once that is working, i can make sort
> suggestions (although my best sort suggestion is for you guys to
> talk with david cossock about sorts).
>
> this whole area is critical. on that cluster, owen's 2TB should
> sort in 10 minutes, and the data should be copied in less than that
> time, for a total run time of <20 minutes.
>
> pleased that yahoo has resources to apply.
>
> paul
>
>> transfer map output transfer with http instead of rpc
>> -----------------------------------------------------
>>
>> Key: HADOOP-195
>> URL: http://issues.apache.org/jira/browse/HADOOP-195
>> Project: Hadoop
>> Type: Improvement
>
>> Components: mapred
>> Versions: 0.2
>> Reporter: Owen O'Malley
>> Assignee: Owen O'Malley
>> Fix For: 0.3
>
>>
>> The data transfer of the map output should be transferred via http
>> instead of rpc, because rpc is very slow for this application and the
>> timeout behavior is suboptimal. (server sends data and client
>> ignores it because it took more than 10 seconds to be received.)
>