Stefan,

The map output files are not stored in DFS; they are on the local disks of the mapper that creates them, which avoids the 3x replication overhead of DFS.

Previously, the files were transferred using the RPC mechanism built for status updates among the nodes. That mechanism multiplexed multiple RPCs through a single TCP connection, and there were various logjams within it. The most straightforward fix was to use HTTP instead of trying to resolve the logjam(s). It's a quick solution to a problem that was a real bottleneck for some of us.

Do you see a drawback to HTTP?

Paul

On 6/1/06, Stefan Groschupf wrote:
> Hi Owen, Hi All,
>
> a silly question, please give me some glue.
> Why we use now http for mapoutput transfer instead of tcp or the dfs
> itself?
> Sorry but the issue HADOOP-254 doesn't give very much information
> just that it is faster, what surprise me a little bit.
>
> Thanks.
> Stefan