hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-3656) fetcher should re-use connection when it needs to fetch multiple segments from the same task tracker
Date Fri, 27 Jun 2008 17:15:44 GMT
fetcher should re-use connection when it needs to fetch multiple segments from the same task tracker
----------------------------------------------------------------------------------------------------

                 Key: HADOOP-3656
                 URL: https://issues.apache.org/jira/browse/HADOOP-3656
             Project: Hadoop Core
          Issue Type: Bug
            Reporter: Runping Qi



In the current implementation, the fetcher fetches one segment at a time from a task tracker.
In the case where a job has N mappers per tracker, each reducer needs N trips to each
tracker.
That generates a lot of network traffic when N and the number of reduces are large.
The problem would be mitigated if the fetcher could retrieve multiple segments from a tracker
per connection, either through HTTP keep-alive or
through an application-level protocol.
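
To make the scaling concrete, here is a rough sketch in Python of the trip counts described above. It assumes that each trip opens its own connection, and that a reused connection serves every segment a reducer needs from one tracker. The function names and the (tracker_host, map_id) tuple layout are hypothetical and chosen for illustration only; they are not part of Hadoop's fetcher code.

```python
from collections import defaultdict


def trips_one_segment_per_connection(mappers_per_tracker, trackers, reducers):
    """Trips when every segment is fetched separately (assumed current behavior)."""
    return mappers_per_tracker * trackers * reducers


def trips_one_connection_per_tracker(trackers, reducers):
    """Trips when each reducer reuses a single connection per tracker."""
    return trackers * reducers


def group_segments_by_tracker(segments):
    """Group (tracker_host, map_id) pairs so each tracker is contacted once.

    The tuple layout is an illustrative assumption, not Hadoop's data model.
    """
    grouped = defaultdict(list)
    for tracker_host, map_id in segments:
        grouped[tracker_host].append(map_id)
    return dict(grouped)


if __name__ == "__main__":
    # Example figures only: 8 mappers per tracker, 100 trackers, 50 reduces.
    print(trips_one_segment_per_connection(8, 100, 50))
    print(trips_one_connection_per_tracker(100, 50))
    print(group_segments_by_tracker([("tt1", "m0"), ("tt2", "m1"), ("tt1", "m2")]))
```

Under these assumptions the number of trips drops by a factor of N. The segment data transferred stays the same; the saving would come from per-connection overhead such as setup and teardown.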


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

