hadoop-mapreduce-issues mailing list archives

From "Devaraj K (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4340) Node Manager leaks socket connections connected to Data Node
Date Thu, 14 Jun 2012 10:11:42 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294938#comment-13294938 ]

Devaraj K commented on MAPREDUCE-4340:

I have looked into this issue, and it appears to be caused by the FileSystem cache.

The Node Manager gets the FileSystem object for copying the files from DFS during localization.


  private Path copy(Path sCopy, Path dstdir) throws IOException {
    FileSystem sourceFs = sCopy.getFileSystem(conf);
    Path dCopy = new Path(dstdir, sCopy.getName() + ".tmp");
    FileStatus sStat = sourceFs.getFileStatus(sCopy);
    if (sStat.getModificationTime() != resource.getTimestamp()) {
      throw new IOException("Resource " + sCopy +
          " changed on src filesystem (expected " + resource.getTimestamp() +
          ", was " + sStat.getModificationTime());
    }

    sourceFs.copyToLocalFile(sCopy, dCopy);
    return dCopy;
  }

This code obtains the file system instance through the FileSystem.get(URI uri, Configuration conf)
API, which internally caches file system instances. For the next job, the FileSystem.Cache.Key
does not match the key of the previous instance, so a new file system instance is created, and
the number of instances keeps growing with every job. Each file system instance has an associated
DFSClient instance, which holds the datanode socket in its socketCache, and nothing closes it.

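To illustrate the cache behaviour described above, here is a toy model in plain Java. It is a sketch, not Hadoop code: FsCacheLeakDemo, JobIdentity, Key, Client and Cache are hypothetical stand-ins, and the reason the keys differ (a fresh identity object per job, compared by reference) is an assumption made for the demo, since the comment does not say which part of the key differs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy model only: none of these classes are Hadoop's. They imitate a cache
// whose key contains a per-job identity object that is compared by reference.
class FsCacheLeakDemo {

  /** Hypothetical per-job identity; deliberately has no equals() override. */
  static final class JobIdentity {
    final String user;
    JobIdentity(String user) { this.user = user; }
  }

  /** Hypothetical cache key: scheme + authority + identity. */
  static final class Key {
    final String scheme;
    final String authority;
    final JobIdentity identity;

    Key(String scheme, String authority, JobIdentity identity) {
      this.scheme = scheme;
      this.authority = authority;
      this.identity = identity;
    }

    @Override public boolean equals(Object o) {
      if (!(o instanceof Key)) return false;
      Key k = (Key) o;
      // The identity is compared by reference, so two jobs run by the
      // same user still produce different keys (assumption of this demo).
      return scheme.equals(k.scheme)
          && authority.equals(k.authority)
          && identity == k.identity;
    }

    @Override public int hashCode() {
      return Objects.hash(scheme, authority, System.identityHashCode(identity));
    }
  }

  /** Hypothetical client that stands for one open datanode socket. */
  static final class Client {
    final boolean open = true; // never closed in this model
  }

  /** Hypothetical instance cache: one client per distinct key. */
  static final class Cache {
    private final Map<Key, Client> map = new HashMap<>();

    Client get(Key key) {
      return map.computeIfAbsent(key, k -> new Client());
    }

    int size() { return map.size(); }
  }

  /** Simulates `jobs` localizations and returns how many clients stay cached. */
  static int simulate(int jobs, boolean reuseIdentity) {
    Cache cache = new Cache();
    JobIdentity shared = new JobIdentity("alice");
    for (int i = 0; i < jobs; i++) {
      JobIdentity id = reuseIdentity ? shared : new JobIdentity("alice");
      cache.get(new Key("hdfs", "namenode:8020", id));
    }
    return cache.size();
  }

  public static void main(String[] args) {
    System.out.println("new identity per job: " + simulate(5, false)); // 5
    System.out.println("shared identity:      " + simulate(5, true));  // 1
  }
}
```

In this model, five jobs with a fresh identity each leave five cached clients behind, while five jobs sharing one identity leave a single client. That matches the one-extra-connection-per-job growth reported in the issue, under the stated assumption about the key.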
> Node Manager leaks socket connections connected to Data Node
> ------------------------------------------------------------
>                 Key: MAPREDUCE-4340
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4340
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>            Priority: Critical
> I am running the simple wordcount example with the default configuration. Every job run adds
> one datanode socket connection, which then stays in the CLOSE_WAIT state forever.


