hadoop-mapreduce-user mailing list archives

From Daniel Watrous <dwmaill...@gmail.com>
Subject Re: Datanodes not connecting to the cluster
Date Thu, 24 Sep 2015 20:06:35 GMT
Phew, I finally added the property below to yarn-site.xml:

  <property>
    <name>yarn.resourcemanager.bind-host</name>
    <value>0.0.0.0</value>
  </property>
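
For anyone double-checking the same setting, here is a minimal Python sketch that lists the bind-host overrides present in a Hadoop `*-site.xml` file. The function name `bind_host_properties` and the `SAMPLE` string are made up for illustration; only the property name comes from this message.

```python
import xml.etree.ElementTree as ET


def bind_host_properties(xml_text):
    """Return {name: value} for every property whose name ends in 'bind-host'."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.endswith("bind-host"):
            found[name] = value
    return found


# Hypothetical file contents, modelled on the snippet in this message.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.bind-host</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>some.other.setting</name>
    <value>ignored</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name, value in bind_host_properties(SAMPLE).items():
        print(name, "=", value)
```

To run it against a real file, pass the file's contents as the argument. A value of 0.0.0.0 means the service listens on all interfaces rather than only on the address its hostname resolves to.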

I now see the datanodes, but not both at the same time. Under datanodes in
operation I see the master and only one of the other two datanodes at a time.
Is that typical behavior? Perhaps it's switching between them for redundancy?

Daniel
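
The log lines quoted in this thread use a `hostname/IP:port` notation, for example `hadoop-master/192.168.51.4:54310`. As far as I know this is simply how Java prints a resolved socket address: the name that was looked up, a slash, the IP address it resolved to, then the port. A small Python sketch, assuming that reading and IPv4 addresses only (`split_address` is a hypothetical helper, not part of Hadoop):

```python
def split_address(text):
    """Split 'hostname/ip:port' into (hostname, ip, port)."""
    # The port follows the last colon.
    host_part, _, port = text.rpartition(":")
    # The hostname and the resolved IP are separated by the first slash.
    hostname, _, ip = host_part.partition("/")
    return hostname, ip, int(port)


if __name__ == "__main__":
    print(split_address("hadoop-master/192.168.51.4:54310"))
```

The IP part is the useful clue when debugging: it shows which address the name resolved to on the machine that wrote the log line.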

On Thu, Sep 24, 2015 at 2:49 PM, Daniel Watrous <dwmaillist@gmail.com>
wrote:

> I'm making a little progress here.
>
> I added the following properties to hdfs-site.xml
>
>   <property>
>     <name>dfs.namenode.rpc-bind-host</name>
>     <value>0.0.0.0</value>
>   </property>
>   <property>
>     <name>dfs.namenode.servicerpc-bind-host</name>
>     <value>0.0.0.0</value>
>   </property>
>
> I can now connect to hadoop-master:
>
> hadoop@hadoop-data1:~$ telnet hadoop-master 54310
> Trying 192.168.51.4...
> Connected to hadoop-master.
> Escape character is '^]'.
>
> BUT I'm now getting the error below. I'm confused that it's trying to
> connect to 192.168.51.1 because that's not even a valid IP in my
> installation.
>
> 2015-09-24 19:40:08,821 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for
> Block pool BP-852923283-127.0.1.1-1443119668806 (Datanode Uuid null)
> service to hadoop-master/192.168.51.4:54310 Datanode denied communication
> with namenode because hostname cannot be resolved (ip=192.168.51.1,
> hostname=192.168.51.1): DatanodeRegistration(0.0.0.0:50010,
> datanodeUuid=cd8107ac-f1dd-4c33-a054-2dee95cc4a16, infoPort=50075,
> infoSecurePort=0, ipcPort=50020,
> storageInfo=lv=-56;cid=CID-8e9d0e10-5744-448b-a45a-87644364e714;nsid=1441606918;c=0)
>
> Any idea what's happening here?
>
>
> On Thu, Sep 24, 2015 at 2:26 PM, Daniel Watrous <dwmaillist@gmail.com>
> wrote:
>
>> In a further test, I tried connecting to the NameNode from hadoop-master
>> (where it's running) using both the hostname and the IP address.
>>
>> vagrant@hadoop-master:~/src$ telnet 192.168.51.4 54310
>> Trying 192.168.51.4...
>> telnet: Unable to connect to remote host: Connection refused
>> vagrant@hadoop-master:~/src$ telnet localhost 54310
>> Trying 127.0.0.1...
>> telnet: Unable to connect to remote host: Connection refused
>> vagrant@hadoop-master:~/src$ telnet hadoop-master 54310
>> Trying 127.0.1.1...
>> Connected to hadoop-master.
>> Escape character is '^]'.
>>
>>
>> As you can see the IP address or localhost connection is refused, but the
>> hostname connection succeeds. Is there some way to configure the namenode
>> to accept connections from all hosts?
>>
>> On Thu, Sep 24, 2015 at 2:14 PM, Daniel Watrous <dwmaillist@gmail.com>
>> wrote:
>>
>>> On one of the datanodes I have found the following warning:
>>>
>>> 2015-09-24 18:40:17,639 WARN
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to
>>> server: hadoop-master/192.168.51.4:54310
>>>
>>> On my master node I see that the process is running and has bound that
>>> port
>>>
>>> vagrant@hadoop-master:~/src$ sudo lsof -i :54310
>>> COMMAND  PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
>>> java    7480 hadoop  202u  IPv4  26931      0t0  TCP hadoop-master:54310 (LISTEN)
>>> java    7480 hadoop  212u  IPv4  28758      0t0  TCP hadoop-master:54310->localhost:47226 (ESTABLISHED)
>>> java    7651 hadoop  238u  IPv4  28247      0t0  TCP localhost:47226->hadoop-master:54310 (ESTABLISHED)
>>> hadoop@hadoop-master:~$ jps
>>> 7856 SecondaryNameNode
>>> 7651 DataNode
>>> 7480 NameNode
>>> 8106 Jps
>>>
>>> I don't appear to have any firewall rules interfering with traffic:
>>> vagrant@hadoop-master:~/src$ sudo iptables --list
>>> Chain INPUT (policy ACCEPT)
>>> target     prot opt source               destination
>>>
>>> Chain FORWARD (policy ACCEPT)
>>> target     prot opt source               destination
>>>
>>> Chain OUTPUT (policy ACCEPT)
>>> target     prot opt source               destination
>>>
>>> The iptables --list output is identical on hadoop-data1. I also show a
>>> process attempting to connect to hadoop-master
>>> vagrant@hadoop-data1:~$ sudo lsof -i :54310
>>> COMMAND  PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
>>> java    3823 hadoop  238u  IPv4  19304      0t0  TCP hadoop-data1:45600->hadoop-master:54310 (SYN_SENT)
>>>
>>> I am confused by the notation of hostname/IP:port.
>>> All help appreciated.
>>>
>>> Daniel
>>>
>>>
>>>
>>> On Thu, Sep 24, 2015 at 12:51 PM, Daniel Watrous <dwmaillist@gmail.com>
>>> wrote:
>>>
>>>> I have a multi-node cluster with two datanodes. After running
>>>> start-dfs.sh, I show the following processes running
>>>>
>>>> hadoop@hadoop-master:~$ $HADOOP_HOME/sbin/slaves.sh jps
>>>> hadoop-master: 10933 DataNode
>>>> hadoop-master: 10759 NameNode
>>>> hadoop-master: 11145 SecondaryNameNode
>>>> hadoop-master: 11567 Jps
>>>> hadoop-data1: 5186 Jps
>>>> hadoop-data1: 5059 DataNode
>>>> hadoop-data2: 5180 Jps
>>>> hadoop-data2: 5053 DataNode
>>>>
>>>>
>>>> However, the other two DataNodes aren't visible.
>>>> http://screencast.com/t/icsLnXXDk
>>>>
>>>> Where can I look for clues?
>>>>
>>>
>>>
>>
>
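
In the telnet test quoted above, `hadoop-master` resolves to 127.0.1.1 on the master itself, and the block pool ID in the later error also contains 127.0.1.1. One plausible reading is that /etc/hosts on the master maps the hostname to a loopback address, so the NameNode was only listening there until the bind-host properties were added. A minimal Python sketch for spotting such entries follows; `loopback_names` and the sample file contents are hypothetical, not taken from the actual machines.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return {hostname: address} for names mapped to a loopback address."""
    result = {}
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            address = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not a line that starts with an IP address
        if address.is_loopback:
            for name in fields[1:]:
                result[name] = fields[0]
    return result


# Hypothetical /etc/hosts contents, for illustration only.
SAMPLE_HOSTS = """127.0.0.1 localhost
127.0.1.1 hadoop-master   # loopback alias for the hostname
192.168.51.4 hadoop-master
"""

if __name__ == "__main__":
    for name, address in loopback_names(SAMPLE_HOSTS).items():
        print(name, "->", address)
```

To check a real machine, pass in the contents of its /etc/hosts. Any cluster hostname that shows up in the result is worth a second look.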
