Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <28507110.1147842367512.JavaMail.jira@brutus>
Date: Wed, 17 May 2006 05:06:07 +0000 (GMT+00:00)
From: "Owen O'Malley (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-195) transfer map output transfer with http instead of rpc
In-Reply-To: <6413697.1146765138386.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ http://issues.apache.org/jira/browse/HADOOP-195?page=comments#action_12412092 ]

Owen O'Malley commented on HADOOP-195:
--------------------------------------

I wrote a
network bandwidth tester that just uses Java sockets to connect all nodes to all nodes. My application waits until all of the servers are up and then starts sending data (10g/node). On my cluster, which is currently at 202 nodes, it took an average of 1423 seconds (mean of 1630) to finish the transfer between nodes. That is substantially faster than Hadoop's shuffle (7 hours?) and means that we have a long way to go in terms of shuffle optimization.

> transfer map output transfer with http instead of rpc
> -----------------------------------------------------
>
>          Key: HADOOP-195
>          URL: http://issues.apache.org/jira/browse/HADOOP-195
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Versions: 0.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>  Attachments: MapFileSimulator.java, data-transfer-chart.pdf, mapfilesimulator-big.txt, mapfilesimulator-sort2.txt, netstat.log, netstat.xls
>
> The map output should be transferred via http instead of rpc, because rpc is very slow for this application and the timeout behavior is suboptimal. (The server sends data and the client ignores it because it took more than 10 seconds to be received.)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
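
For readers who want to try the kind of all-to-all socket test described in the comment above, here is a minimal sketch. It is not the tester that was actually run on the cluster: the class name AllToAllSocketTest, the run() method, and its parameters are hypothetical, and all "nodes" are simulated as threads on the loopback interface of one JVM, so it measures nothing about a real network. It only illustrates the structure: bring every server up first, then have each node send a fixed amount of data to every other node over plain Java sockets.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of an all-to-all socket transfer, simulated on loopback. */
public class AllToAllSocketTest {

    /** Runs the transfer and returns the total number of bytes received by all nodes. */
    public static long run(int nodes, int bytesPerPeer) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        ExecutorService pool = Executors.newCachedThreadPool();
        List<ServerSocket> servers = new ArrayList<>();
        try {
            // Bring every node's server up before any data is sent.
            for (int i = 0; i < nodes; i++) {
                servers.add(new ServerSocket(0, nodes, loopback)); // port 0 = any free port
            }

            // Receivers: each node accepts one connection per peer and reads it to EOF.
            List<Future<Long>> received = new ArrayList<>();
            for (ServerSocket server : servers) {
                received.add(pool.submit(() -> {
                    long count = 0;
                    byte[] buf = new byte[8192];
                    for (int peer = 0; peer < nodes - 1; peer++) {
                        try (Socket s = server.accept();
                             InputStream in = s.getInputStream()) {
                            int n;
                            while ((n = in.read(buf)) != -1) {
                                count += n;
                            }
                        }
                    }
                    return count;
                }));
            }

            // Senders: each node connects to every other node and writes bytesPerPeer bytes.
            List<Future<?>> sent = new ArrayList<>();
            for (int i = 0; i < nodes; i++) {
                final int self = i;
                sent.add(pool.submit(() -> {
                    byte[] chunk = new byte[8192];
                    for (int j = 0; j < nodes; j++) {
                        if (j == self) {
                            continue;
                        }
                        try (Socket s = new Socket(loopback, servers.get(j).getLocalPort());
                             OutputStream out = s.getOutputStream()) {
                            for (int left = bytesPerPeer; left > 0; left -= chunk.length) {
                                out.write(chunk, 0, Math.min(left, chunk.length));
                            }
                        }
                    }
                    return null;
                }));
            }

            for (Future<?> f : sent) {
                f.get(); // propagate any sender failure
            }
            long total = 0;
            for (Future<Long> f : received) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdownNow();
            for (ServerSocket s : servers) {
                s.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int nodes = 4;               // illustrative values, far smaller than the cluster test
        int bytesPerPeer = 1 << 20;
        long start = System.nanoTime();
        long total = run(nodes, bytesPerPeer);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("moved %d bytes between %d nodes in %.3f s%n", total, nodes, seconds);
    }
}
```

On a real cluster each node would be a separate process on its own host and the per-node volume would be much larger; the receive loop above also handles one peer at a time, which a serious tester would likely parallelize.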