hadoop-common-user mailing list archives

From Erik Hetzner <erik.hetz...@ucop.edu>
Subject Problem with slave nodes which return a bogus hostname
Date Mon, 01 Oct 2007 23:37:48 GMT
Hi all.

I am not sure whether I should file a separate bug report or whether this
is a duplicate of <http://issues.apache.org/jira/browse/HADOOP-1487>.

I am trying to get started with running Hadoop, but have encountered a
small problem. I am running Hadoop on a very small cluster of two slave
machines, which sit on a local network behind another machine. These
machines do not have hostnames, only IP addresses. The first trouble I
had was jobs not completing. Here is a sample from the logs
(hadoop/logs/userlogs/task_200710011429_0001_r_000001_0):

2007-10-01 14:47:16,545 WARN org.apache.hadoop.mapred.ReduceTask: java.io.FileNotFoundException:
http://localhost.localdomain:50060/mapOutput?map=task_200710011429_0001_m_000017_0&reduce=1
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1151)
        at org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:207)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:701)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:659)

Tracking this down, I discovered that the localhost.localdomain, which
of course cannot be resolved between machines, ultimately comes from a
line in the TaskTracker where it obtains a local hostname by calling
DNS.getDefaultHost. I replaced this call with DNS.getDefaultIP. That
gets closer to a solution, but now I get the following (in
hadoop/logs/userlogs/*/syslog on a task node):

Caused by: java.net.URISyntaxException: Malformed escape pair at index 34: http://fe80:0:0:0:207:e9ff:fe1a:81%3:50060/tasklog?plaintext=true&taskid=task_200710011624_0001_m_000002_0

As you can see, the default IP is an IPv6 address, and the "%3" zone-ID
suffix on the link-local address is parsed as the start of a percent
escape.
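For what it's worth, the failure can be reproduced outside Hadoop with a few lines of plain java.net code (this is just a demonstration I put together, not Hadoop code):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ZoneIdUriDemo {
    public static void main(String[] args) {
        // A link-local IPv6 address carries a "%<zone>" suffix. Embedded
        // raw in a URL, the '%' is read as the start of a percent-escape,
        // and since the next characters are not two hex digits, parsing
        // fails with "Malformed escape pair".
        String url = "http://fe80:0:0:0:207:e9ff:fe1a:81%3:50060/tasklog";
        try {
            new URI(url);
            System.out.println("parsed");
        } catch (URISyntaxException e) {
            System.out.println("URISyntaxException: " + e.getReason());
        }
    }
}
```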

I have solved this problem for myself by adding a static
getDefaultIPv4 method to the DNS class, which grabs the first IP
matching an IPv4 regex. This seems to work, since I can now complete
wordcount jobs, but I am not sure it is the correct solution.
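Roughly, the idea looks like the sketch below. (This is a simplified standalone version, not my actual patch to the DNS class; it filters with instanceof Inet4Address rather than a regex, but the effect is the same, and the fallback behavior is my own choice.)

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Enumeration;

public class DnsIpv4Sketch {
    // Return the first IPv4 address on the named interface; if the
    // interface is missing or has no IPv4 address, scan all interfaces.
    public static String getDefaultIPv4(String strInterface) {
        try {
            NetworkInterface netIf = NetworkInterface.getByName(strInterface);
            if (netIf != null) {
                String ip = firstIPv4(netIf);
                if (ip != null) return ip;
            }
            Enumeration<NetworkInterface> ifs =
                NetworkInterface.getNetworkInterfaces();
            while (ifs.hasMoreElements()) {
                String ip = firstIPv4(ifs.nextElement());
                if (ip != null) return ip;
            }
        } catch (SocketException e) {
            // fall through to loopback below
        }
        return "127.0.0.1";
    }

    // First IPv4 address bound to this interface, or null if none.
    private static String firstIPv4(NetworkInterface netIf) {
        Enumeration<InetAddress> addrs = netIf.getInetAddresses();
        while (addrs.hasMoreElements()) {
            InetAddress addr = addrs.nextElement();
            if (addr instanceof Inet4Address) {  // skip IPv6 addresses
                return addr.getHostAddress();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(getDefaultIPv4("eth0"));
    }
}
```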

Additionally, I wonder if somebody might explain to me why I must set
the mapred.tasktracker.dns.interface property in hadoop-site rather
than in mapred-default? Is this intentional, or an oversight?

Many thanks for your help in advance.

best,
Erik Hetzner
