hadoop-hdfs-user mailing list archives

From hitarth trivedi <t.hita...@gmail.com>
Subject Re: node manager ports during mapreduce job
Date Tue, 13 Jan 2015 13:52:34 GMT
Hi,



Yes, after 10 minutes the attempt expires and is relaunched, and this time I can
see it is on a different node manager.

Let me describe the configuration. There is 1 resource manager talking to 4
node managers. If only one node manager is running, everything works
fine. If multiple node managers are running, it works only if the firewall
is off on these node managers.
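
One way to narrow this down is to test, from each node, whether the other nodes' service ports are reachable while the firewall is on. The following is only a minimal sketch: the host name and the port list are placeholders and are not taken from this thread, so substitute the addresses and ports configured in your own yarn-site.xml.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to True (reachable) or False (blocked or closed)."""
    return {port: can_connect(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Hypothetical host and ports, for illustration only.
    host = "nodemanager2.example.com"
    ports = [8031, 8040, 8042, 13562]
    for port, ok in check_ports(host, ports).items():
        print(f"{host}:{port} -> {'reachable' if ok else 'NOT reachable'}")
```

Running it once with the firewall off and once with it on, from the same node, shows which ports the firewall rule is missing. A False result can also mean that nothing is listening on that port, so compare against the firewall-off run.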

I have attached the logs from a run with 2 node managers so that it is easier to
debug. A typical mapreduce program with a single node manager, or with
multiple node managers and the firewall turned off, takes about 30 sec.
The run in the attached logs, with 2 node managers, took 11 min.

If all 4 are running, it sometimes takes 40 minutes, or it times out
after about 45 minutes.



Let me know what we are doing wrong.


Thanks,

Hitarth

On Sun, Jan 11, 2015 at 11:27 PM, Rohith Sharma K S <
rohithsharmaks@huawei.com> wrote:

>  Hi
>
>
>
> Could you give more information regarding the problem?
>
>
>
> I did not understand what you mean by this statement:
>
> >> Upon submitting the mapreduce job to the resource manager, it gets
> stuck at getResources() for 10 min, times out, and then tries another node
> manager.
>
> If the MRAppMaster does not communicate with the RM for 10 mins, the RM will
> expire that application attempt and try to relaunch it. But you have
> mentioned that it is trying another node manager. Which daemon is trying
> another node manager?
>
>
>
> I suggest that whenever there is a problem like getting stuck, you take a
> thread dump using *jstack <pid>*; this would help in analyzing the issue faster.
>
>
>
> Any free port, i.e. 1024<=x<=65535, should work fine.
>
>
>
> Thanks & Regards
>
> Rohith Sharma K S
>
>
>
> *From:* hitarth trivedi [mailto:t.hitarth@gmail.com]
> *Sent:* 12 January 2015 07:01
> *To:* user@hadoop.apache.org
> *Subject:* node manager ports during mapreduce job
>
>
>
> Hi,
>
>
>
> We have a resource manager with 4 node managers. Upon submitting the
> mapreduce job to the resource manager, it gets stuck at
> getResources() for 10 min, times out, and then tries another node
> manager.
>
> When only one nodemanager is running, everything is fine. Upon turning off
> the firewall on all node managers, everything seems to work.
>
> Looking at netstat, the nodemanagers/resourcemanagers were communicating
> over a wide range of ports between 30000 and 61000.
>
> So I opened the TCP ports in the range 30000:61000 and turned on the
> firewall. But it does not seem to work.
>
> Any idea what needs to be done here?
>
>
>
> Thx
>
> -Hitarth
>
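
A rough check of the 30000:61000 rule described in the original message: given the port numbers seen in netstat, list the ones that the opened range would not cover. This is only a sketch. The thread does not include the netstat output, so the port list in the example is made up, and the function name is hypothetical.

```python
def ports_outside_range(observed_ports, low=30000, high=61000):
    """Return the sorted, de-duplicated ports that fall outside [low, high]."""
    return sorted({p for p in observed_ports if not low <= p <= high})


if __name__ == "__main__":
    # Made-up sample of ports; replace with the ports actually seen in netstat.
    observed = [8031, 45000, 30000, 61000, 61001]
    print(ports_outside_range(observed))
```

Any port this reports is one that the daemons use but the firewall rule does not allow, which would be consistent with the job working only when the firewall is off.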
