hadoop-user mailing list archives

From Daniel Watrous <dwmaill...@gmail.com>
Subject Re: Datanodes not connecting to the cluster
Date Thu, 24 Sep 2015 19:49:31 GMT
I'm making a little progress here.

I added the following properties to hdfs-site.xml

  <property>
    <name>dfs.namenode.rpc-bind-host</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-bind-host</name>
    <value>0.0.0.0</value>
  </property>
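
To double-check that both bind-host properties were actually picked up from the file, something like the following Python sketch could be used. It assumes the properties sit inside the usual <configuration> root element; the read_properties helper and the SAMPLE string are illustrative names, not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> element in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing or empty <value> is treated as an empty string.
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


# Illustrative copy of the two properties above, wrapped in <configuration>.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.rpc-bind-host</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-bind-host</name>
    <value>0.0.0.0</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for key, value in read_properties(SAMPLE).items():
        print(key, "=", value)
```

Reading the real file instead of SAMPLE would just mean passing the contents of hdfs-site.xml to read_properties.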

I can now connect to hadoop-master:

hadoop@hadoop-data1:~$ telnet hadoop-master 54310
Trying 192.168.51.4...
Connected to hadoop-master.
Escape character is '^]'.
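
The same telnet-style check can be scripted so it is easy to repeat from each node. This is only a rough sketch: can_connect is a made-up helper name, and the host and port in the comment are the ones from my setup.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try a plain TCP connect, roughly what `telnet host port` does.

    Returns (True, peer_ip) on success, or (False, error_text) on failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # getpeername() shows which address the hostname resolved to.
            return True, sock.getpeername()[0]
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution failures.
        return False, str(exc)


# Example call (not executed here): can_connect("hadoop-master", 54310)
```

The returned peer address is the useful part, since it shows whether the hostname resolved to the 192.168.51.x address or to a loopback address.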

BUT I'm now getting the error below. I'm confused that it's trying to
connect to 192.168.51.1 because that's not even a valid IP in my
installation.

2015-09-24 19:40:08,821 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for
Block pool BP-852923283-127.0.1.1-1443119668806 (Datanode Uuid null)
service to hadoop-master/192.168.51.4:54310 Datanode denied communication
with namenode because hostname cannot be resolved (ip=192.168.51.1,
hostname=192.168.51.1): DatanodeRegistration(0.0.0.0:50010,
datanodeUuid=cd8107ac-f1dd-4c33-a054-2dee95cc4a16, infoPort=50075,
infoSecurePort=0, ipcPort=50020,
storageInfo=lv=-56;cid=CID-8e9d0e10-5744-448b-a45a-87644364e714;nsid=1441606918;c=0)

Any idea what's happening here?
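
In the error above the ip and hostname fields are identical, which as far as I can tell means a reverse lookup of the connecting address returned no name. Here is a small Python sketch for pulling those two fields out of the log and trying the reverse lookup by hand. The helper names are made up for illustration, and the check is only an approximation of whatever the NameNode does internally.

```python
import re
import socket

# Matches the "(ip=..., hostname=...)" part of the denial message,
# tolerating the line wrap that the log viewer inserts after the comma.
DENIED = re.compile(r"ip=(?P<ip>[^,\s]+),\s*hostname=(?P<hostname>[^)\s]+)\)")


def parse_denied(log_text):
    """Return (ip, hostname) from a 'Datanode denied communication' message, or None."""
    match = DENIED.search(log_text)
    if match is None:
        return None
    return match.group("ip"), match.group("hostname")


def reverse_lookup(ip, lookup=socket.gethostbyaddr):
    """Return the name the local resolver gives for ip, or None if there is none.

    The lookup argument exists only so the resolver can be swapped out in tests.
    """
    try:
        return lookup(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
```

Running reverse_lookup on the NameNode host with the ip value from the log would show whether that machine can map the address back to a name, for example through an /etc/hosts entry.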


On Thu, Sep 24, 2015 at 2:26 PM, Daniel Watrous <dwmaillist@gmail.com>
wrote:

> In a further test, I tried connecting to the NameNode from hadoop-master
> (where it's running) using both the hostname and the IP address.
>
> vagrant@hadoop-master:~/src$ telnet 192.168.51.4 54310
> Trying 192.168.51.4...
> telnet: Unable to connect to remote host: Connection refused
> vagrant@hadoop-master:~/src$ telnet localhost 54310
> Trying 127.0.0.1...
> telnet: Unable to connect to remote host: Connection refused
> vagrant@hadoop-master:~/src$ telnet hadoop-master 54310
> Trying 127.0.1.1...
> Connected to hadoop-master.
> Escape character is '^]'.
>
>
> As you can see the IP address or localhost connection is refused, but the
> hostname connection succeeds. Is there some way to configure the namenode
> to accept connections from all hosts?
>
> On Thu, Sep 24, 2015 at 2:14 PM, Daniel Watrous <dwmaillist@gmail.com>
> wrote:
>
>> On one of the datanodes I have found the following warning:
>>
>> 2015-09-24 18:40:17,639 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to
>> server: hadoop-master/192.168.51.4:54310
>>
>> On my master node I see that the process is running and has bound that
>> port
>>
>> vagrant@hadoop-master:~/src$ sudo lsof -i :54310
>> COMMAND  PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
>> java    7480 hadoop  202u  IPv4  26931      0t0  TCP hadoop-master:54310
>> (LISTEN)
>> java    7480 hadoop  212u  IPv4  28758      0t0  TCP
>> hadoop-master:54310->localhost:47226 (ESTABLISHED)
>> java    7651 hadoop  238u  IPv4  28247      0t0  TCP
>> localhost:47226->hadoop-master:54310 (ESTABLISHED)
>> hadoop@hadoop-master:~$ jps
>> 7856 SecondaryNameNode
>> 7651 DataNode
>> 7480 NameNode
>> 8106 Jps
>>
>> I don't appear to have any firewall rules interfering with traffic:
>> vagrant@hadoop-master:~/src$ sudo iptables --list
>> Chain INPUT (policy ACCEPT)
>> target     prot opt source               destination
>>
>> Chain FORWARD (policy ACCEPT)
>> target     prot opt source               destination
>>
>> Chain OUTPUT (policy ACCEPT)
>> target     prot opt source               destination
>>
>> The iptables --list output is identical on hadoop-data1. On that node I
>> also see a process attempting to connect to hadoop-master:
>> vagrant@hadoop-data1:~$ sudo lsof -i :54310
>> COMMAND  PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
>> java    3823 hadoop  238u  IPv4  19304      0t0  TCP
>> hadoop-data1:45600->hadoop-master:54310 (SYN_SENT)
>>
>> I am confused by the notation of hostname/IP:port.
>> All help appreciated.
>>
>> Daniel
>>
>>
>>
>> On Thu, Sep 24, 2015 at 12:51 PM, Daniel Watrous <dwmaillist@gmail.com>
>> wrote:
>>
>>> I have a multi-node cluster with two datanodes. After running
>>> start-dfs.sh, I show the following processes running
>>>
>>> hadoop@hadoop-master:~$ $HADOOP_HOME/sbin/slaves.sh jps
>>> hadoop-master: 10933 DataNode
>>> hadoop-master: 10759 NameNode
>>> hadoop-master: 11145 SecondaryNameNode
>>> hadoop-master: 11567 Jps
>>> hadoop-data1: 5186 Jps
>>> hadoop-data1: 5059 DataNode
>>> hadoop-data2: 5180 Jps
>>> hadoop-data2: 5053 DataNode
>>>
>>>
>>> However, the other two DataNodes aren't visible.
>>> http://screencast.com/t/icsLnXXDk
>>>
>>> Where can I look for clues?
>>>
>>
>>
>
