hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "YourNetworkYourProblem" by SteveLoughran
Date Mon, 30 Dec 2013 12:49:03 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "YourNetworkYourProblem" page has been changed by SteveLoughran:

add an explicit page about your network being your problem

New page:
= Your Network Your Problem =

Hadoop is a distributed application that runs across a cluster of machines.

For it to work, all these machines must be able to find each other, to talk to each other,
and indeed, simply identify themselves so that other machines in the cluster can find them.
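
A minimal sketch of the "identify themselves" part, assuming that a hostname which resolves only to a loopback address is of no use to the other machines in the cluster. The function names and the injectable `resolve` parameter are illustrative, not part of Hadoop:

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if the address string is a loopback address (127.0.0.0/8 or ::1)."""
    return ipaddress.ip_address(address).is_loopback


def check_self_identity(resolve=socket.gethostbyname, hostname=None):
    """Resolve this machine's own hostname and report whether the result looks usable.

    `resolve` is injectable (an assumption of this sketch) so the check can be
    exercised without touching the real resolver.
    Returns a (hostname, address_or_None, verdict) tuple.
    """
    name = hostname or socket.getfqdn()
    try:
        address = resolve(name)
    except OSError as exc:  # socket.gaierror and socket.herror subclass OSError
        return name, None, f"hostname does not resolve: {exc}"
    if is_loopback(address):
        # Other machines cannot reach this host through a loopback address.
        return name, address, "hostname resolves to a loopback address"
    return name, address, "ok"
```

Calling `check_self_identity()` with no arguments on each machine gives a quick first look at how that machine sees itself; anything other than "ok" is worth chasing down before looking at Hadoop itself.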

Externally accessible Hadoop clusters need to be visible to the rest of the network that
needs access to them.

And of course, all these machines need to be wired together using network switches and routers.

For that reason, network setup is a critical part of a Hadoop cluster. If you do not get it right,
Hadoop will not work and you will be left staring at stack traces in Hadoop logs trying to
diagnose what is wrong. You may even file bug reports saying "Help! Hadoop doesn't work!"

It does work for everybody else, and the reason it does not work for you is that your network
is misconfigured.

And, because it is your network, nobody else is going to fix it for you, except in the special
case that you are using a paid packaging of Hadoop, in which case you should contact your vendor and
ask them for help. The Hadoop developers cannot and will not help you: filing bug reports
will simply result in the issue being closed as invalid, along with a link to the InvalidJiraIssues page.
Here are some of the common problems in network and host configurations:

 1. DNS and reverse DNS broken/non-existent.
 2. Host tables in the machines invalid.
 3. Firewalls in the hosts blocking connections.
 4. Routers blocking traffic.
 5. Hosts with multiple network cards listening/talking on the wrong NIC.
 6. Differences between the Hadoop configuration files' definition of the cluster (especially
hostnames and ports) and the actual cluster setup.
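
Several of the problems listed above can be probed from any machine before looking at Hadoop at all. The following is a rough sketch, not a complete diagnostic: the function names are made up for illustration, and the injectable `forward` and `reverse` parameters are an assumption of the sketch so that it can be tried without a live network.

```python
import socket


def forward_reverse_match(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Check that hostname -> IP -> hostname round-trips (DNS, reverse DNS, host tables).

    Returns (True, address) on success or (False, explanation) on failure.
    """
    try:
        address = forward(hostname)
        # gethostbyaddr returns (primary_name, alias_list, address_list)
        primary, aliases, _addresses = reverse(address)
    except OSError as exc:  # lookup errors are subclasses of OSError
        return False, f"lookup failed: {exc}"
    known = {primary.lower(), *(alias.lower() for alias in aliases)}
    if hostname.lower() in known:
        return True, address
    return False, f"{address} maps back to {primary}, not {hostname}"


def port_reachable(host, port, timeout=3.0):
    """Attempt a TCP connection to host:port.

    A failure here points at a firewall, a router, a service bound to the wrong
    NIC, or a host/port in the configuration that does not match reality.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

Run `forward_reverse_match` against every hostname that appears in the Hadoop configuration files, from every machine in the cluster, and `port_reachable` against each host and port pair configured there. The answers must be consistent everywhere; one machine that disagrees with the rest is enough to break things.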

The TroubleShooting page lists some recurrent error messages, possible root causes and ways
to track down the problem.

If these don't work, you could consider asking for help on the Hadoop user list, but remember:
it is your network, and nobody else is going to be able to fix it.

The key point to remember is this: it is your network that
