hadoop-mapreduce-user mailing list archives

From Allen Wittenauer <awittena...@linkedin.com>
Subject Re: Separate communications of HDFS and MapReduce
Date Mon, 26 Apr 2010 18:10:24 GMT

On Apr 26, 2010, at 6:23 AM, Druilhe Remi wrote:
> For example, when I run the "wordcount" example, there are both HDFS and MapReduce
> communications, and I am not able to distinguish which packets belong to HDFS and which to MapReduce.

This shouldn't be too surprising, given that the MapReduce job needs to talk to HDFS to determine
its input and to write its output.

> One way could be to use odd port numbers for HDFS and even port numbers for MapReduce, but
> I think I would have to modify the source code.

The ports for the services are already separated out.  
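As a rough illustration of that separation, a captured packet can be labeled by its server-side port. The sketch below assumes the stock default ports of the 0.20-era releases (for example 50010 for DataNode data transfer and 50030 for the JobTracker web UI) plus the commonly used 8020 and 8021 for NameNode and JobTracker RPC. The RPC ports in particular come from fs.default.name and mapred.job.tracker, so check them against your own configuration. The table and function names are made up for this example.

```python
# Assumed port-to-service tables; adjust them to match your *-site.xml files.
HDFS_PORTS = {
    8020: "NameNode RPC",            # assumption: taken from fs.default.name
    50010: "DataNode data transfer",
    50020: "DataNode IPC",
    50070: "NameNode web UI",
    50075: "DataNode web UI",
    50090: "SecondaryNameNode web UI",
}
MAPRED_PORTS = {
    8021: "JobTracker RPC",          # assumption: taken from mapred.job.tracker
    50030: "JobTracker web UI",
    50060: "TaskTracker HTTP (web UI and shuffle)",
}


def classify_port(port):
    """Return (framework, service) for a single server-side port."""
    if port in HDFS_PORTS:
        return ("HDFS", HDFS_PORTS[port])
    if port in MAPRED_PORTS:
        return ("MapReduce", MAPRED_PORTS[port])
    return ("unknown", "unlisted or ephemeral port")


def classify_packet(src_port, dst_port):
    """Label a packet by whichever end is a known service port.

    The client side normally uses an ephemeral port, so requests match on
    dst_port and replies match on src_port.
    """
    for port in (dst_port, src_port):
        result = classify_port(port)
        if result[0] != "unknown":
            return result
    return classify_port(dst_port)


if __name__ == "__main__":
    print(classify_packet(45123, 50010))
    print(classify_packet(50060, 39000))
```

Anything that comes back as "unknown" is worth checking against the actual listeners on the node, since some daemons bind ports that are chosen at startup rather than fixed in the config.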

In general, client -> server connections map out as:

MR -> MR, HDFS
HDFS -> HDFS

Given a small 3 node grid, a dump of which processes open which ports, and which connections
are made between all the machines, it should be trivial to build a more detailed connection
map.  [You can probably even do it as a map reduce job. :) ]
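One way to sketch that, assuming the dumps have already been parsed into tuples: join the list of listening ports against the list of observed connections, and collect which client process talks to which server process. The tuple layouts, the function name, and the sample data below are all invented for illustration; real input would come from something like netstat or lsof output on each node.

```python
from collections import defaultdict


def build_connection_map(listeners, connections):
    """Build a client-process -> server-processes map.

    listeners:   iterable of (host, port, process) for listening sockets
    connections: iterable of (client_host, client_process,
                              server_host, server_port)
    """
    # Who owns each listening (host, port) pair.
    owner = {(host, port): proc for host, port, proc in listeners}

    conn_map = defaultdict(set)
    for _c_host, c_proc, s_host, s_port in connections:
        server = owner.get((s_host, s_port), "unknown")
        conn_map[c_proc].add(server)
    return dict(conn_map)


# Hypothetical sample data for a 3 node grid.
listeners = [
    ("node1", 8020, "NameNode"),
    ("node1", 8021, "JobTracker"),
    ("node2", 50010, "DataNode"),
    ("node3", 50010, "DataNode"),
    ("node2", 50060, "TaskTracker"),
]
connections = [
    ("node2", "TaskTracker", "node1", 8021),
    ("node2", "TaskTracker", "node1", 8020),
    ("node2", "DataNode", "node1", 8020),
    ("node3", "DataNode", "node2", 50010),
]

if __name__ == "__main__":
    result = build_connection_map(listeners, connections)
    for client, servers in sorted(result.items()):
        print(client, "->", ", ".join(sorted(servers)))
```

Because the port owners are read from the dump rather than hard-coded, this also picks up daemons that listen on non-default or dynamically assigned ports.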