hadoop-mapreduce-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4323) NM leaks sockets
Date Thu, 07 Jun 2012 15:40:23 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291078#comment-13291078 ]

Daryn Sharp commented on MAPREDUCE-4323:

In particular, {{DFSClient}} maintains a socket cache.  Closed sockets are not detected until
another connection is needed or the client is closed.  That's a separate issue, but the NM's
failure to close a user's filesystems after the app completes leaks sockets in the
CLOSE_WAIT state, which eventually exhausts the fds for the process.

Calling {{FileSystem.closeAllForUGI}}, as the JT does, is troublesome in that it may close the
fs for other apps running as that user.  One approach is to partition the fs cache so that
each app maintains its own cache of filesystems.  See HADOOP-8490 for possible approaches,
which would allow closing just the app's filesystems, à la the JT.
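
To make the partitioning idea concrete, here is a minimal toy sketch in plain Java.
It is not Hadoop code and does not reflect any HADOOP-8490 patch: the class
{{AppScopedFsCacheSketch}}, the stand-in {{FakeFs}}, the app ids, and the URI are all
hypothetical.  It only shows why closing by user can hit other apps, while closing by
app cannot.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Toy model only: none of these types are Hadoop classes. */
class AppScopedFsCacheSketch {

    /** Hypothetical stand-in for a cached filesystem that holds sockets. */
    static final class FakeFs {
        final String user;
        final String uri;
        private boolean open = true;

        FakeFs(String user, String uri) {
            this.user = user;
            this.uri = uri;
        }

        boolean isOpen() { return open; }

        void close() { open = false; }
    }

    // Outer key: app id (the assumed partition). Inner key: user + "@" + uri.
    private final Map<String, Map<String, FakeFs>> byApp = new HashMap<>();

    /** Returns the app's cached instance, creating it on first use. */
    synchronized FakeFs get(String appId, String user, String uri) {
        return byApp.computeIfAbsent(appId, a -> new HashMap<>())
                    .computeIfAbsent(user + "@" + uri, k -> new FakeFs(user, uri));
    }

    /** Closes every instance owned by the user, across all apps. */
    synchronized int closeAllForUser(String user) {
        int closed = 0;
        for (Map<String, FakeFs> perApp : byApp.values()) {
            Iterator<FakeFs> it = perApp.values().iterator();
            while (it.hasNext()) {
                FakeFs fs = it.next();
                if (fs.user.equals(user)) {
                    fs.close();
                    it.remove();
                    closed++;
                }
            }
        }
        return closed;
    }

    /** Closes only the instances cached for one app. */
    synchronized int closeAllForApp(String appId) {
        Map<String, FakeFs> perApp = byApp.remove(appId);
        if (perApp == null) {
            return 0;
        }
        for (FakeFs fs : perApp.values()) {
            fs.close();
        }
        return perApp.size();
    }

    public static void main(String[] args) {
        AppScopedFsCacheSketch cache = new AppScopedFsCacheSketch();
        FakeFs app1Fs = cache.get("app_1", "alice", "hdfs://nn:8020");
        FakeFs app2Fs = cache.get("app_2", "alice", "hdfs://nn:8020");

        // App 1 finishes: only its instance is closed.
        cache.closeAllForApp("app_1");
        System.out.println("app_1 fs open: " + app1Fs.isOpen());
        System.out.println("app_2 fs open: " + app2Fs.isOpen());
    }
}
```

In this model {{closeAllForUser}} plays the role of a close-by-user call, and would
close app_2's instance as well if it were invoked when app_1 finishes.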

Also note that failure to close filesystems causes all future jobs to use the configuration
of the first job.  This will be very problematic, so it's imperative to ensure apps each get
their own cached instances.
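
The stale-configuration effect can be reproduced with another small toy model in plain
Java.  It assumes a cache whose key does not include the configuration, so the
configuration only matters when the entry is first created.  {{StaleConfCacheSketch}}
and {{FakeFs}} are hypothetical names, and {{dfs.replication}} is used purely as an
example setting.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these types are Hadoop classes. */
class StaleConfCacheSketch {

    /** Hypothetical stand-in for a filesystem that captures its conf at creation. */
    static final class FakeFs {
        final Map<String, String> conf;

        FakeFs(Map<String, String> conf) {
            this.conf = new HashMap<>(conf);
        }
    }

    // Key is user + "@" + uri; the configuration is deliberately not part of it.
    private final Map<String, FakeFs> cache = new HashMap<>();

    /** The conf argument is used only if no entry exists yet for the key. */
    synchronized FakeFs get(String user, String uri, Map<String, String> conf) {
        return cache.computeIfAbsent(user + "@" + uri, k -> new FakeFs(conf));
    }

    /** Evicts all of the user's entries, as would happen when an app is cleaned up. */
    synchronized void closeAll(String user) {
        cache.keySet().removeIf(k -> k.startsWith(user + "@"));
    }

    public static void main(String[] args) {
        StaleConfCacheSketch cache = new StaleConfCacheSketch();
        Map<String, String> job1 = Collections.singletonMap("dfs.replication", "2");
        Map<String, String> job2 = Collections.singletonMap("dfs.replication", "3");

        cache.get("alice", "hdfs://nn:8020", job1);
        // Nothing was closed after job 1, so job 2 is handed job 1's instance.
        FakeFs stale = cache.get("alice", "hdfs://nn:8020", job2);
        System.out.println("job 2 sees: " + stale.conf.get("dfs.replication"));

        cache.closeAll("alice");
        FakeFs fresh = cache.get("alice", "hdfs://nn:8020", job2);
        System.out.println("after close, job 2 sees: " + fresh.conf.get("dfs.replication"));
    }
}
```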
> NM leaks sockets
> ----------------
>                 Key: MAPREDUCE-4323
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Priority: Critical
> The NM is exhausting its fds because it's not closing fs instances when the app is finished.
