giraph-user mailing list archives

From Bojan Babic <gba...@gmail.com>
Subject Re: Issue with Giraph on multinode cluster
Date Sat, 18 Oct 2014 00:52:55 GMT
I'm using Giraph 1.1.0-SNAPSHOT for Hadoop 1.2.1.

On Fri, Oct 17, 2014 at 4:01 PM, Bojan Babic <gbabun@gmail.com> wrote:

> Hi guys,
>
> I may be raising an issue that has already been reported, but I'll take
> the risk of being ridiculed :)
>
> I have a small Hadoop cluster on Digital Ocean (1 master, 4 nodes). I was
> able to set up the cluster and run the word count example as well as the
> single-node sample from the Quick Start.
>
> As I bring more nodes into play, I hit an issue where the TaskTracker
> spawns Child processes
>
> hduser@hdnode-2:~# jps
>> 13839 TaskTracker
>> 13697 DataNode
>> 14067 Jps
>> 13962 Child
>> *13961 Child*
>
>
> that listen on the loopback interface
>
>> Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode       PID/Program name
>> tcp        0      0 127.0.0.1:1337          0.0.0.0:*               LISTEN      root       21544925    29912/python
>> tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN      hduser     21691552    13697/java
>> tcp        0      0 127.0.0.1:30011         0.0.0.0:*               LISTEN      hduser     21693578    13962/java
>> tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN      hduser     21691554    13697/java
>> tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN      hduser     21691557    13697/java
>> tcp        0      0 127.0.0.1:50118         0.0.0.0:*               LISTEN      hduser     21691870    13839/java
>> tcp        0      0 0.0.0.0:41640           0.0.0.0:*               LISTEN      hduser     21691296    13697/java
>> tcp        0      0 127.0.0.1:31337         0.0.0.0:*               LISTEN      root       20432660    1514/python
>> tcp        0      0 0.0.0.0:50060           0.0.0.0:*               LISTEN      hduser     21692144    13839/java
>> tcp        0      0 0.0.0.0:http-alt        0.0.0.0:*               LISTEN      root       20431897    1421/python
>> *tcp        0      0 127.0.0.1:30001         0.0.0.0:*               LISTEN      hduser     21370004    7856/ssh
>> tcp        0      0 127.0.0.1:30003         0.0.0.0:*               LISTEN      hduser     21693562    13961/java*
>> tcp        0      0 127.0.0.1:58741         0.0.0.0:*               LISTEN      hduser     21370000    7856/ssh
>> tcp        0      0 127.0.0.1:58742         0.0.0.0:*               LISTEN      hduser     21369982    7845/autossh
>> tcp        0      0 0.0.0.0:ssh             0.0.0.0:*               LISTEN      root       9130        834/sshd
>> tcp6       0      0 ::1:30001               :::*                    LISTEN      hduser     21370003    7856/ssh
>> tcp6       0      0 ::1:58741               :::*                    LISTEN      hduser     21369999    7856/ssh
>> tcp6       0      0 :::ssh                  :::*                    LISTEN      root       9165        834/sshd
>
>
> instead of on all interfaces (0.0.0.0).
>
> This results in the node being unreachable from other nodes, i.e. hdnode02:
>
>>
>> 2014-10-17 14:10:31,146 WARN org.apache.giraph.comm.netty.NettyClient:
>> 2014-10-17 14:10:31,159 WARN org.apache.giraph.comm.netty.NettyClient:
>> connectAllAddresses: Future failed to connect with
>> hdnode-2/XXX.XXX.XXX.XXX:30003 with 1 failures because of
>> java.net.ConnectException: Connection refused:
>> *hdnode-2/XXX.XXX.XXX.XXX:30003*
>> 2014-10-17 14:10:31,159 INFO org.apache.giraph.comm.netty.NettyClient:
>> connectAllAddresses: Successfully added 1 connections, (1 total connected)
>> 2 failed, 2 failures total.
>
>
> If I stop all processes and start nc on port 30003, I can telnet to hdnode2.
>
> Is there any setting that makes the Child process listen on 0.0.0.0
> instead of the loopback interface?
>
> Thanks in advance
>
>
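
One thing worth checking for the question quoted above, offered as an assumption rather than a confirmed diagnosis: whether each worker's own hostname resolves to a loopback address (for example through a 127.0.0.1 or 127.0.1.1 entry for the hostname in /etc/hosts). If the Child process binds to whatever address its hostname resolves to, that would explain a 127.0.0.1 listener such as the one on port 30003. A minimal Python sketch of that check follows; the function names are illustrative and not part of Giraph or Hadoop.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if the IP address string lies in a loopback range."""
    return ipaddress.ip_address(address).is_loopback


def resolve_own_hostname():
    """Resolve this machine's hostname through the local resolver.

    Returns a (hostname, ip) tuple. Raises socket.gaierror if the
    hostname cannot be resolved at all.
    """
    hostname = socket.gethostname()
    return hostname, socket.gethostbyname(hostname)
```

Run on a worker, `is_loopback(resolve_own_hostname()[1])` returning True would mean the hostname points at loopback on that machine, which is consistent with (but does not prove) the behaviour described.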

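To spot loopback-only listeners quickly on each node, the socket listing quoted above can be filtered with a short script. This is a sketch that assumes one socket per line and the same nine-column layout as that listing (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, User, Inode, PID/Program name); the exact command used to produce the listing is not given in the message, and the function name is hypothetical.

```python
def loopback_listeners(netstat_output):
    """Return (local_address, program) pairs for TCP sockets that
    listen on a loopback address only."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, wrapped fragments, and non-listening sockets.
        if len(fields) < 9 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "LISTEN":
            continue
        local = fields[3]
        # Split on the last colon so IPv6 forms like "::1:30001" work.
        host = local.rsplit(":", 1)[0]
        if host.startswith("127.") or host == "::1":
            found.append((local, fields[8]))
    return found
```

Any `java` process reported here on a port that other workers try to reach (30003 in the log above) is a candidate for the connection failures.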

-- 
--------------------------------
Bojan Babic, M.Sc.E.E
Software developer
twitter: @bojanbabic
mobile: +1312 8602944
