hadoop-common-user mailing list archives

From Manoj Babu <manoj...@gmail.com>
Subject Re: Too many open files error with YARN
Date Thu, 21 Mar 2013 08:25:06 GMT
In the meantime, you can quickly compare the source of the class
with the patch provided in the bug.

Cheers!
Manoj.


On Thu, Mar 21, 2013 at 12:13 PM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> Hi Hemanth & Sandy,
>
>   Thanks for your reply. Yes, lsof shows the sockets in the CLOSE_WAIT state,
> exactly as below:
>
> java      30718     dsadm  200u     IPv4         1178376459      0t0  TCP *:50010 (LISTEN)
> java      31512     dsadm  240u     IPv6         1178391921      0t0  TCP node1:51342->node1:50010 (CLOSE_WAIT)
>
> I just checked the link
> https://issues.apache.org/jira/browse/HDFS-3357 and it shows 2.0.0-alpha in
> both the affected versions and the fix versions.
>
> There is another bug, HDFS-3591, at
> https://issues.apache.org/jira/browse/HDFS-3591
>
> which says it is for backporting HDFS-3357 to branch 0.23.
>
> So, I don't understand whether the fix is really in 2.0.0-alpha. Could you
> please clarify?
>
> Thanks,
> Kishore
>
>
>
>
>
> On Thu, Mar 21, 2013 at 9:57 AM, Hemanth Yamijala <
> yhemanth@thoughtworks.com> wrote:
>
>> There was an issue related to hung connections (HDFS-3357). But the JIRA
>> indicates the fix is available in Hadoop-2.0.0-alpha. Still, it would be worth
>> checking Sandy's suggestion.
>>
>>
>> On Wed, Mar 20, 2013 at 11:09 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
>>
>>> Hi Kishore,
>>>
>>> 50010 is the datanode port. Does your lsof indicate that the sockets are
>>> in CLOSE_WAIT?  I had come across an issue like this where that was a
>>> symptom.
>>>
>>> -Sandy
>>>
>>>
>>> On Wed, Mar 20, 2013 at 4:24 AM, Krishna Kishore Bonagiri <
>>> write2kishore@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>>  I am running the date command with YARN's distributed shell example
>>>> 1000 times in a loop, like this:
>>>>
>>>> yarn jar
>>>> /home/kbonagir/yarn/hadoop-2.0.0-alpha/share/hadoop/mapreduce/hadoop-yarn-applications-distributedshell-2.0.0-alpha.jar
>>>> org.apache.hadoop.yarn.applications.distributedshell.Client --jar
>>>> /home/kbonagir/yarn/hadoop-2.0.0-alpha/share/hadoop/mapreduce/hadoop-yarn-applications-distributedshell-2.0.0-alpha.jar
>>>> --shell_command date --num_containers 2
>>>>
>>>>
>>>> Around the 730th run or so, I get an error in the node manager's log
>>>> saying that it failed to launch a container because there are "Too many open
>>>> files". When I check with the lsof command, I find that one connection of
>>>> the following kind is left open for each run of the Application Master, and
>>>> the count keeps growing as the loop runs.
>>>>
>>>> node1:44871->node1:50010
>>>>
>>>> Is this a known issue? Or am I missing something? Please help.
>>>>
>>>> Note: I am working on hadoop-2.0.0-alpha
>>>>
>>>> Thanks,
>>>> Kishore
>>>>
>>>
>>>
>>
>
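
The growth in leaked connections described in the thread can be tracked by counting CLOSE_WAIT sockets to the datanode port (50010) after each run. Below is a minimal sketch in Python, assuming lsof output in the same format as the rows quoted above (for example, captured from `lsof -i :50010`). The function name `count_close_wait` and the regular expression are illustrative only and are not part of Hadoop or lsof.

```python
import re

# Matches the NAME column of an lsof TCP connection row, e.g.
# "TCP node1:51342->node1:50010 (CLOSE_WAIT)".
# Listening sockets such as "TCP *:50010 (LISTEN)" have no "->" and are skipped.
_CONN = re.compile(r"TCP\s+\S+->\S+:(\d+)\s+\((\w+)\)")


def count_close_wait(lsof_text, port):
    """Count TCP connections to `port` that lsof reports as CLOSE_WAIT.

    Hypothetical helper: assumes the lsof row format shown in this thread.
    """
    count = 0
    for match in _CONN.finditer(lsof_text):
        remote_port, state = match.groups()
        if int(remote_port) == port and state == "CLOSE_WAIT":
            count += 1
    return count


# Sample rows copied from the lsof output quoted in the thread.
sample = """\
java      30718     dsadm  200u     IPv4         1178376459      0t0  TCP *:50010 (LISTEN)
java      31512     dsadm  240u     IPv6         1178391921      0t0  TCP node1:51342->node1:50010 (CLOSE_WAIT)
"""

print(count_close_wait(sample, 50010))  # prints 1
```

If the count rises by about one per Application Master run, that matches the symptom reported here; a steady count would point elsewhere.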
