hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "HadoopIPv6" by SteveLoughran
Date Wed, 20 Jan 2010 20:27:06 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HadoopIPv6" page has been changed by SteveLoughran.
The comment on this change is: More on IPv6.
http://wiki.apache.org/hadoop/HadoopIPv6?action=diff&rev1=1&rev2=2

--------------------------------------------------

- Some general info on how to avoid Hadoop problems in IPv6-enabled servers
+ = Hadoop and IPv6 =
  
+ Apache Hadoop does not currently support IPv6 networks; it uses IPv4 addresses for all communication
between nodes. Hadoop is designed to work in private datacenters, which usually
have private IPv4 addresses in the 10.x.x.x address space. Staying with IPv4 has two benefits:
+ 
+  1. Using IPv4 addresses everywhere provides a single form of TCP addressing for all our
tests. Different network configurations (DNS, reverse DNS, DNS caching) still cause plenty
of problems and performance issues, but there is no need to worry about which IP protocol
version is in use.
+  1. Shorter addresses make for shorter packets, which can help on busy networks.

+ 
+ This does not mean that the Hadoop team thinks that IPv4 is the best network protocol ever
and that there is never any reason to upgrade, only that it works well in datacenters. If you
are using Hadoop in other places you may encounter problems. A key limitation of this design
decision is that Hadoop needs IPv4 to work, and only IPv4 clients can talk to the
cluster. Equally critically, MapReduce jobs cannot talk to services, including web services,
that are only reachable over IPv6. If your organisation moves to IPv6 only, you will encounter problems.
+ 
+ 
+ In the meantime, the main concern is that some Linux distributions try to force Hadoop
to use IPv6, which does not work.
+  1. Many recent Linux distributions do not allow you to turn IPv6 off. If Hadoop, or Jetty
running under Hadoop, picks up an IPv6 address, other machines may not be able to talk
to it.
+  1. Later Linux releases default to being IPv6 only. Unless those systems are configured
to re-enable IPv4, some machines break. As of Jan 2010, this was causing problems in Debian
[[http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=560044|1]], [[http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=560056|2]],
which in turn led to bug reports against other programs: [[http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6342561|Sun bug database]],
[[https://issues.apache.org/jira/browse/HADOOP-6056|Apache Jira]].
+ 
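To see whether a JVM on a given machine is picking up IPv6 addresses, as described in the list above, a small check along the following lines may help. This is only a sketch and not part of Hadoop: the class name `AddressFamilyCheck` and its helper methods are made up for illustration, and it uses nothing beyond the standard `java.net` classes.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.Collections;
import java.util.Enumeration;

// Hypothetical helper, not part of Hadoop: reports which IP family each
// local address belongs to, as seen by the JVM.
class AddressFamilyCheck {

    /** Returns "IPv4", "IPv6" or "unknown" for the given address. */
    static String family(InetAddress addr) {
        if (addr instanceof Inet4Address) {
            return "IPv4";
        }
        if (addr instanceof Inet6Address) {
            return "IPv6";
        }
        return "unknown";
    }

    /** Classifies an IP address literal such as "10.1.2.3" or "::1". */
    static String familyOfLiteral(String literal) {
        try {
            // For a literal address only the format is checked; no DNS lookup is made.
            return family(InetAddress.getByName(literal));
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a usable address: " + literal, e);
        }
    }

    public static void main(String[] args) throws SocketException {
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            System.out.println("No network interfaces found");
            return;
        }
        // Print one line per address bound to each interface.
        for (NetworkInterface nic : Collections.list(nics)) {
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                System.out.println(nic.getName() + " " + family(addr) + " " + addr.getHostAddress());
            }
        }
    }
}
```

If the output lists only IPv6 addresses for the interface the cluster is meant to use, that host is a likely candidate for the connectivity problems described above. One commonly used workaround, which this page does not cover, is to start the JVM with the standard `java.net.preferIPv4Stack=true` system property; whether that is sufficient on a particular distribution is something to verify locally.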
