hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-2830) Need to instrument Hadoop to get comprehensive network traffic metrics
Date Thu, 14 Feb 2008 17:33:08 GMT
Need to instrument Hadoop to get comprehensive network traffic metrics
----------------------------------------------------------------------

                 Key: HADOOP-2830
                 URL: https://issues.apache.org/jira/browse/HADOOP-2830
             Project: Hadoop Core
          Issue Type: Improvement
            Reporter: Runping Qi



One of the most frequently asked questions about Hadoop performance is whether a job was
CPU bound, disk bound, or network bound.
The first two cases can be answered from the metrics of individual machines, so they are
relatively easy to answer.
The third is much harder, especially for a large cluster. To answer it, we
need to know the following:

1. The network traffic to and from the nodes in the cluster
2. The network traffic going between node pairs through the switch they share
3. The network traffic going through the back links between the switches

With this data, we can gain better insight into the relationship between network bandwidth
and Hadoop performance.

We need to instrument the Hadoop code to obtain the above data.
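
As a rough illustration of the three measurements listed above, the sketch below
aggregates per-transfer byte counts into per-node totals, per-node-pair totals under a
shared switch, and per-switch-pair totals for traffic crossing the back links. It is not
existing Hadoop code: the class TrafficAggregator, its record() method, and the
node-to-switch map are hypothetical names, and it assumes the instrumented code can
report the source node, destination node, and byte count of each transfer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical aggregator; none of these names come from the Hadoop code base.
public class TrafficAggregator {
    // Assumed to be supplied by the cluster topology: node name -> switch name.
    private final Map<String, String> switchOfNode;

    // 1. Traffic to and from each node.
    public final Map<String, Long> bytesOutOfNode = new HashMap<>();
    public final Map<String, Long> bytesIntoNode = new HashMap<>();
    // 2. Traffic between node pairs that share a switch, keyed "src->dst".
    public final Map<String, Long> bytesWithinSwitch = new HashMap<>();
    // 3. Traffic crossing the links between switches, keyed "srcSwitch->dstSwitch".
    public final Map<String, Long> bytesBetweenSwitches = new HashMap<>();

    public TrafficAggregator(Map<String, String> switchOfNode) {
        this.switchOfNode = switchOfNode;
    }

    // Record one transfer of `bytes` bytes from node `src` to node `dst`.
    public void record(String src, String dst, long bytes) {
        bytesOutOfNode.merge(src, bytes, Long::sum);
        bytesIntoNode.merge(dst, bytes, Long::sum);

        String srcSwitch = switchOfNode.get(src);
        String dstSwitch = switchOfNode.get(dst);
        if (srcSwitch == null || dstSwitch == null) {
            return; // topology unknown for this pair; only per-node totals are kept
        }
        if (srcSwitch.equals(dstSwitch)) {
            bytesWithinSwitch.merge(src + "->" + dst, bytes, Long::sum);
        } else {
            bytesBetweenSwitches.merge(srcSwitch + "->" + dstSwitch, bytes, Long::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, String> topology = new HashMap<>();
        topology.put("node1", "switchA");
        topology.put("node2", "switchA");
        topology.put("node3", "switchB");

        TrafficAggregator agg = new TrafficAggregator(topology);
        agg.record("node1", "node2", 100L); // stays under switchA
        agg.record("node1", "node3", 50L);  // crosses from switchA to switchB

        System.out.println("out of node:      " + agg.bytesOutOfNode);
        System.out.println("into node:        " + agg.bytesIntoNode);
        System.out.println("within switch:    " + agg.bytesWithinSwitch);
        System.out.println("between switches: " + agg.bytesBetweenSwitches);
    }
}
```

Keeping the three views in separate maps means each question (node, shared switch,
back link) can be answered without rescanning the raw transfer records.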



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

