hadoop-user mailing list archives

From Daniel Watrous <dwmaill...@gmail.com>
Subject Re: Problem running example (wrong IP address)
Date Mon, 28 Sep 2015 15:13:49 GMT
After talking with the Vagrant community, I decided I was being too clever
in trying to run the datanodes on a separate subnet from the master node. I
changed my configuration so that all three hosts are on the same subnet, and
everything now works as expected.
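For anyone who finds this thread later, the working setup looks roughly like
the sketch below. This is illustrative rather than my exact Vagrantfile: the
data-node IPs (.5 and .6) and the box name are assumptions; the key point is
that all three hosts share the same 192.168.51.x private network.

```ruby
# Sketch of the working layout: all three hosts on one private
# subnet, so no gateway sits between the datanodes and the namenode.
# Box name and the data-node IPs are illustrative, not exact.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"

  hosts = {
    "hadoop-master" => "192.168.51.4",
    "hadoop-data1"  => "192.168.51.5",
    "hadoop-data2"  => "192.168.51.6",
  }

  hosts.each do |name, ip|
    config.vm.define name do |node|
      node.vm.hostname = name
      node.vm.network "private_network", ip: ip
    end
  end
end
```

With no gateway between subnets, the datanodes register with the namenode
using their actual addresses.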

Thanks for all your help and input.
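One more note for the archives: if someone hits the same symptom but cannot
put all hosts on one subnet, HDFS can also be told to prefer hostnames over
the IP addresses that datanodes register. I did not end up needing this; the
sketch below just sets the two relevant properties from hdfs-default.xml
(Hadoop 2.x):

```xml
<!-- hdfs-site.xml sketch: have clients and datanodes connect by
     hostname instead of the IP address a datanode registers with.
     Requires /etc/hosts (or DNS) to be consistent on every node. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```

With these set, and the /etc/hosts entries shown earlier in the thread
identical on every node, the reported IP no longer matters to clients.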

On Mon, Sep 28, 2015 at 9:07 AM, Daniel Watrous <dwmaillist@gmail.com>
wrote:

> Vinay,
>
> There is no gateway for the 51.* subnet. These are IP addresses that I set in my
> Vagrantfile for virtualbox as part of a private network:
> master.vm.network "private_network", ip: 192.168.51.4
>
> This allows me to spin up all the hosts for my cluster automatically and
> know that they always have the same IP addresses.
>
> From hadoop-data1 (192.168.52.4) I have unrestricted access to
> hadoop-master (192.168.51.4):
>
> hadoop@hadoop-data1:~$ ifconfig
> eth1      Link encap:Ethernet  HWaddr 08:00:27:b9:55:25
>           inet addr:192.168.52.4  Bcast:192.168.52.255  Mask:255.255.255.0
> hadoop@hadoop-data1:~$ ping hadoop-master
> PING hadoop-master (192.168.51.4) 56(84) bytes of data.
> 64 bytes from hadoop-master (192.168.51.4): icmp_seq=1 ttl=63 time=3.13 ms
> 64 bytes from hadoop-master (192.168.51.4): icmp_seq=2 ttl=63 time=2.72 ms
>
> I'm not sure I understand exactly what you're asking for, but from the
> master I can run this
>
> vagrant@hadoop-master:~$ sudo netstat -tnulp | grep 54310
> tcp        0      0 0.0.0.0:54310           0.0.0.0:*
> LISTEN      22944/java
>
> I understand what you're saying about a gateway often existing at that
> address for a subnet. I'm not familiar enough with Vagrant to answer this
> right now, but I will ask about it there.
>
> I can also change the other two IP addresses to be on the same 51.* subnet.
> I may try that next.
>
>
>
> On Mon, Sep 28, 2015 at 8:33 AM, Vinayakumar B <vinayakumarb@apache.org>
> wrote:
>
>> 192.168.51.1 might be the gateway for the 51.* subnet, right?
>>
>> Can you verify whether a connection from outside the 51 subnet to the 51.4
>> machine shows the other subnet's IP as the remote IP?
>>
>> You can create any kind of connection; it doesn't have to be
>> namenode-to-datanode.
>>
>> For example, a connection from the 192.168.52.4 DN to the 192.168.51.4
>> namenode should show up as follows when checked with the netstat command on
>> the namenode machine: "netstat -tnp | grep <NN_RPC_PORT>"
>>
>> The output should be something like the below (note the remote address):
>>
>> tcp        0      0   192.168.51.4:54310        192.168.52.4:32567
>>     ESTABLISHED      -
>>
>>
>> If the foreign IP is listed as 192.168.51.1 instead of 192.168.52.4,
>> then the gateway is not passing the original client IP forward; it is
>> re-creating connections with its own IP. In that case the problem lies
>> with the gateway.
>>
>> It's just a guess; reality could be different.
>>
>> Please check and let me know.
>>
>> -Vinay
>>
>> On Mon, Sep 28, 2015 at 6:45 PM, Daniel Watrous <dwmaillist@gmail.com>
>> wrote:
>>
>>> Thanks to Namikaze for pointing out that I should have sent the namenode
>>> log as a pastebin:
>>>
>>> http://pastebin.com/u33bBbgu
>>>
>>>
>>> On Mon, Sep 28, 2015 at 8:02 AM, Daniel Watrous <dwmaillist@gmail.com>
>>> wrote:
>>>
>>>> I have posted the namenode logs here:
>>>> https://gist.github.com/dwatrous/dafaa7695698f36a5d93
>>>>
>>>> Thanks for all the help.
>>>>
>>>> On Sun, Sep 27, 2015 at 10:28 AM, Brahma Reddy Battula <
>>>> brahmareddy.battula@hotmail.com> wrote:
>>>>
>>>>> Thanks for sharing the logs.
>>>>>
>>>>> The problem is interesting. Can you please post the namenode logs and the
>>>>> dual-IP configuration? (I'm thinking there is a problem with the gateway
>>>>> when sending requests from the 52.* segment to the 51.* segment.)
>>>>>
>>>>> Thanks And Regards
>>>>> Brahma Reddy Battula
>>>>>
>>>>>
>>>>> ------------------------------
>>>>> Date: Fri, 25 Sep 2015 12:19:00 -0500
>>>>>
>>>>> Subject: Re: Problem running example (wrong IP address)
>>>>> From: dwmaillist@gmail.com
>>>>> To: user@hadoop.apache.org
>>>>>
>>>>> hadoop-master http://pastebin.com/yVF8vCYS
>>>>> hadoop-data1 http://pastebin.com/xMEdf01e
>>>>> hadoop-data2 http://pastebin.com/prqd02eZ
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Sep 25, 2015 at 11:53 AM, Brahma Reddy Battula <
>>>>> brahmareddy.battula@hotmail.com> wrote:
>>>>>
>>>>> Sorry, I am not able to access the logs. Could you please post to pastebin
>>>>> or attach the 192.168.51.6 DN logs (as your query is about the different
>>>>> IP) and the namenode logs here?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks And Regards
>>>>> Brahma Reddy Battula
>>>>>
>>>>>
>>>>> ------------------------------
>>>>> Date: Fri, 25 Sep 2015 11:16:55 -0500
>>>>> Subject: Re: Problem running example (wrong IP address)
>>>>> From: dwmaillist@gmail.com
>>>>> To: user@hadoop.apache.org
>>>>>
>>>>>
>>>>> Brahma,
>>>>>
>>>>> Thanks for the reply. I'll keep this conversation here in the user
>>>>> list. The /etc/hosts file is identical on all three nodes:
>>>>>
>>>>> hadoop@hadoop-data1:~$ cat /etc/hosts
>>>>> 127.0.0.1 localhost
>>>>> 192.168.51.4 hadoop-master
>>>>> 192.168.52.4 hadoop-data1
>>>>> 192.168.52.6 hadoop-data2
>>>>>
>>>>> hadoop@hadoop-data2:~$ cat /etc/hosts
>>>>> 127.0.0.1 localhost
>>>>> 192.168.51.4 hadoop-master
>>>>> 192.168.52.4 hadoop-data1
>>>>> 192.168.52.6 hadoop-data2
>>>>>
>>>>> hadoop@hadoop-master:~$ cat /etc/hosts
>>>>> 127.0.0.1 localhost
>>>>> 192.168.51.4 hadoop-master
>>>>> 192.168.52.4 hadoop-data1
>>>>> 192.168.52.6 hadoop-data2
>>>>>
>>>>> Here are the startup logs for all three nodes:
>>>>> https://gist.github.com/dwatrous/7241bb804a9be8f9303f
>>>>> https://gist.github.com/dwatrous/bcd85cda23d6eca3a68b
>>>>> https://gist.github.com/dwatrous/922c4f773aded0137fa3
>>>>>
>>>>> Thanks for your help.
>>>>>
>>>>>
>>>>> On Fri, Sep 25, 2015 at 10:33 AM, Brahma Reddy Battula <
>>>>> brahmareddy.battula@huawei.com> wrote:
>>>>>
>>>>> It seems the DN was started on three machines and failed on
>>>>> hadoop-data1 (192.168.52.4).
>>>>>
>>>>> 192.168.51.6: reporting its IP as 192.168.51.1. Can you please check the
>>>>> /etc/hosts file of 192.168.51.6? (192.168.51.1 might be configured in
>>>>> /etc/hosts.)
>>>>>
>>>>> 192.168.52.4: datanode startup might have failed (you can check that
>>>>> node's logs).
>>>>>
>>>>> 192.168.51.4: datanode startup succeeded; this is the master node.
>>>>>
>>>>>
>>>>>
>>>>> Thanks & Regards
>>>>>  Brahma Reddy Battula
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------
>>>>> *From:* Daniel Watrous [dwmaillist@gmail.com]
>>>>> *Sent:* Friday, September 25, 2015 8:41 PM
>>>>> *To:* user@hadoop.apache.org
>>>>> *Subject:* Re: Problem running example (wrong IP address)
>>>>>
>>>>> I'm still stuck on this and posted it to stackoverflow:
>>>>>
>>>>> http://stackoverflow.com/questions/32785256/hadoop-datanode-binds-wrong-ip-address
>>>>>
>>>>> Thanks,
>>>>> Daniel
>>>>>
>>>>> On Fri, Sep 25, 2015 at 8:28 AM, Daniel Watrous <dwmaillist@gmail.com>
>>>>> wrote:
>>>>>
>>>>> I could really use some help here. As you can see from the output
>>>>> below, the two attached datanodes are identified with a non-existent IP
>>>>> address. Can someone tell me how that gets selected, or how to explicitly
>>>>> set it? Also, why are both datanodes shown under the same name/IP?
>>>>>
>>>>> hadoop@hadoop-master:~$ hdfs dfsadmin -report
>>>>> Configured Capacity: 84482326528 (78.68 GB)
>>>>> Present Capacity: 75745546240 (70.54 GB)
>>>>> DFS Remaining: 75744862208 (70.54 GB)
>>>>> DFS Used: 684032 (668 KB)
>>>>> DFS Used%: 0.00%
>>>>> Under replicated blocks: 0
>>>>> Blocks with corrupt replicas: 0
>>>>> Missing blocks: 0
>>>>> Missing blocks (with replication factor 1): 0
>>>>>
>>>>> -------------------------------------------------
>>>>> Live datanodes (2):
>>>>>
>>>>> Name: 192.168.51.1:50010 (192.168.51.1)
>>>>> Hostname: hadoop-data1
>>>>> Decommission Status : Normal
>>>>> Configured Capacity: 42241163264 (39.34 GB)
>>>>> DFS Used: 303104 (296 KB)
>>>>> Non DFS Used: 4302479360 (4.01 GB)
>>>>> DFS Remaining: 37938380800 (35.33 GB)
>>>>> DFS Used%: 0.00%
>>>>> DFS Remaining%: 89.81%
>>>>> Configured Cache Capacity: 0 (0 B)
>>>>> Cache Used: 0 (0 B)
>>>>> Cache Remaining: 0 (0 B)
>>>>> Cache Used%: 100.00%
>>>>> Cache Remaining%: 0.00%
>>>>> Xceivers: 1
>>>>> Last contact: Fri Sep 25 13:25:37 UTC 2015
>>>>>
>>>>>
>>>>> Name: 192.168.51.4:50010 (hadoop-master)
>>>>> Hostname: hadoop-master
>>>>> Decommission Status : Normal
>>>>> Configured Capacity: 42241163264 (39.34 GB)
>>>>> DFS Used: 380928 (372 KB)
>>>>> Non DFS Used: 4434300928 (4.13 GB)
>>>>> DFS Remaining: 37806481408 (35.21 GB)
>>>>> DFS Used%: 0.00%
>>>>> DFS Remaining%: 89.50%
>>>>> Configured Cache Capacity: 0 (0 B)
>>>>> Cache Used: 0 (0 B)
>>>>> Cache Remaining: 0 (0 B)
>>>>> Cache Used%: 100.00%
>>>>> Cache Remaining%: 0.00%
>>>>> Xceivers: 1
>>>>> Last contact: Fri Sep 25 13:25:38 UTC 2015
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Sep 24, 2015 at 5:05 PM, Daniel Watrous <dwmaillist@gmail.com>
>>>>> wrote:
>>>>>
>>>>> The IP address is clearly wrong, but I'm not sure how it gets set. Can
>>>>> someone tell me how to configure it to choose a valid IP address?
>>>>>
>>>>> On Thu, Sep 24, 2015 at 3:26 PM, Daniel Watrous <dwmaillist@gmail.com>
>>>>> wrote:
>>>>>
>>>>> I just noticed that both datanodes appear to have chosen that IP
>>>>> address and bound that port for HDFS communication.
>>>>>
>>>>> http://screencast.com/t/OQNbrWFF
>>>>>
>>>>> Any idea why this would be? Is there some way to specify which
>>>>> IP/hostname should be used for that?
>>>>>
>>>>> On Thu, Sep 24, 2015 at 3:11 PM, Daniel Watrous <dwmaillist@gmail.com>
>>>>> wrote:
>>>>>
>>>>> When I try to run a map reduce example, I get the following error:
>>>>>
>>>>> hadoop@hadoop-master:~$ hadoop jar
>>>>> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar
>>>>> pi 10 30
>>>>> Number of Maps  = 10
>>>>> Samples per Map = 30
>>>>> 15/09/24 20:04:28 INFO hdfs.DFSClient: Exception in
>>>>> createBlockOutputStream
>>>>> java.io.IOException: Got error, status message , ack with firstBadLink
>>>>> as 192.168.51.1:50010
>>>>>         at
>>>>> org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1334)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
>>>>> 15/09/24 20:04:28 INFO hdfs.DFSClient: Abandoning
>>>>> BP-852923283-127.0.1.1-1443119668806:blk_1073741825_1001
>>>>> 15/09/24 20:04:28 INFO hdfs.DFSClient: Excluding datanode
>>>>> DatanodeInfoWithStorage[192.168.51.1:50010
>>>>> ,DS-45f6e06d-752e-41e8-ac25-ca88bce80d00,DISK]
>>>>> 15/09/24 20:04:28 WARN hdfs.DFSClient: Slow waitForAckedSeqno took
>>>>> 65357ms (threshold=30000ms)
>>>>> Wrote input for Map #0
>>>>>
>>>>> I'm not sure why it's trying to access 192.168.51.1:50010, which
>>>>> isn't even a valid IP address in my setup.
>>>>>
>>>>> Daniel
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
