hadoop-common-dev mailing list archives

From "Paul Sutter" <psut...@quantcast.com>
Subject RE: [jira] Commented: (HADOOP-195) transfer map output transfer with http instead of rpc
Date Thu, 11 May 2006 16:11:21 GMT

I assumed it was on one big switch. Sounds like an easy theory to test, and
even easier to fix. 

How hard is it to monitor traffic levels on the switch cross-connects?

-----Original Message-----
From: Doug Cutting (JIRA) [mailto:jira@apache.org] 
Sent: Thursday, May 11, 2006 9:03 AM
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-195) transfer map output transfer with
http instead of rpc

    [ http://issues.apache.org/jira/browse/HADOOP-195?page=comments#action_12379092 ]

Doug Cutting commented on HADOOP-195:
-------------------------------------

Everything we're now seeing is consistent with the inter-rack switches being
the primary bottleneck.  With 188 nodes sharing a 1Gb/s backbone, there's
only about 600kB/s per node.  We're seeing ten 80kB files transferred per
second, or 800kB/s, which is slightly higher since some files are already on
the same rack.
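
The per-node figures above can be checked with a quick calculation. This is
only a rough sketch: it assumes "1Gb/s" means 10^9 bits per second and that
all 188 nodes share the backbone evenly, neither of which is stated in the
thread. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the bandwidth figures in the comment.
# Assumptions (not stated in the thread): 1Gb/s = 10**9 bits per second,
# and the backbone is shared evenly by every node.

BACKBONE_BITS_PER_SEC = 1_000_000_000
NODES = 188


def per_node_bytes_per_sec(backbone_bits_per_sec, nodes):
    """Even share of the backbone per node, in bytes per second."""
    return backbone_bits_per_sec / 8 / nodes


def observed_bytes_per_sec(files_per_sec, file_size_bytes):
    """Throughput implied by a file rate and a file size."""
    return files_per_sec * file_size_bytes


fair_share = per_node_bytes_per_sec(BACKBONE_BITS_PER_SEC, NODES)
observed = observed_bytes_per_sec(10, 80_000)  # ten 80kB files per second

print(f"even share per node: {fair_share / 1000:.0f} kB/s")
print(f"observed per node:   {observed / 1000:.0f} kB/s")
```

Under these assumptions the even share comes to roughly 665kB/s, in line
with the rounded 600kB/s figure, and the observed 800kB/s sits somewhat above
it, as expected if part of the traffic never leaves the rack.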

Instead of caching temp files in RAM, we can try to transfer files soon
after they are generated and to process them on the remote end soon after
they are received.  That way we can benefit from the kernel's cache,
getting performance similar to what we'd see if we cached them ourselves.

> transfer map output transfer with http instead of rpc
> -----------------------------------------------------
>
>          Key: HADOOP-195
>          URL: http://issues.apache.org/jira/browse/HADOOP-195
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Versions: 0.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>  Attachments: data-transfer-chart.pdf, netstat.log, netstat.xls
>
> The map output should be transferred via http instead of rpc, because rpc
> is very slow for this application and the timeout behavior is suboptimal.
> (The server sends data and the client ignores it because it took more than
> 10 seconds to be received.)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


