hadoop-common-issues mailing list archives

From "Daryn Sharp (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7973) FileSystem close has severe consequences
Date Fri, 13 Jan 2012 16:56:40 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185669#comment-13185669 ]

Daryn Sharp commented on HADOOP-7973:
-------------------------------------


We are seeing two specific failure cases with the same cause:
* An MR task that uses {{FsShell}}.  The shell opens a DFS, performs its action, and then closes
the DFS.  The MR input stream's subsequent close against that same filesystem will fail.
* User map task code that opens the default filesystem and subsequently closes it.  The MR input
stream close will fail.
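The failure mode above can be sketched in plain Java (this is a minimal simulation of the shared-cache behavior, not actual Hadoop code; the class and method names are hypothetical):

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-in for a cached DFSClient: one instance is handed to
// every caller that asks for the same filesystem, so one caller's close()
// invalidates the instance for everyone else.
class SharedClient implements Closeable {
    private boolean closed = false;

    void checkOpen() throws IOException {
        if (closed) {
            throw new IOException("Filesystem closed");
        }
    }

    @Override
    public void close() {
        closed = true;
    }
}

public class CachedCloseDemo {
    public static void main(String[] args) {
        SharedClient cached = new SharedClient();  // the single cached instance
        SharedClient shellsCopy = cached;          // FsShell is given the cached instance
        SharedClient streamsCopy = cached;         // the MR input stream holds it too

        shellsCopy.close();                        // FsShell finishes and closes "its" fs
        try {
            streamsCopy.checkOpen();               // the MR stream's later use now fails
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```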

The problem is being seen with oozie jobs, but is not unique to oozie.  If the MR task opens
the input/output streams with a DFS URI lacking a port number, then it gets a different filesystem
instance than user code, which obtains the default filesystem via {{fs.default.name}}, a value that
does include the port number.  Effectively, the issue is hidden, and arguably it is a bug that
getting a filesystem with and without the default port returns different filesystem instances.
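The with/without-port split can be seen with {{java.net.URI}} alone (the namenode host below is hypothetical; 8020 is HDFS's conventional default RPC port): the two URIs have different authorities, so a cache keyed on scheme and authority treats them as two filesystems.

```java
import java.net.URI;

public class CacheKeyDemo {
    public static void main(String[] args) {
        // Hypothetical namenode host; same cluster named two ways.
        URI noPort = URI.create("hdfs://namenode/user/data");
        URI withPort = URI.create("hdfs://namenode:8020/user/data");

        // The authorities differ, so a cache keyed on scheme + authority
        // hands back two distinct instances for the same cluster.
        System.out.println(noPort.getAuthority());    // "namenode"
        System.out.println(withPort.getAuthority());  // "namenode:8020"
        System.out.println(noPort.getPort());         // -1 (unset)
        System.out.println(withPort.getPort());       // 8020
    }
}
```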

There are 3 approaches that can be taken:
# {{FsShell#close}} will be a no-op
# Closing a read stream will not generate an exception if the {{DFSClient}} is closed.
# {{DistributedFileSystem#close}} becomes a no-op.  The finalizer will close the {{DFSClient}}.

#1 & #2 are simply workarounds for specific use-cases.  The problem can still happen if
user code or libraries get a filesystem and close it.

#3 is a more comprehensive solution since a decision was made on an earlier jira to not add
reference counting to cached filesystem objects.
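Approach #3 might look roughly like the following sketch (the class and method names are illustrative assumptions, not the actual patch):

```java
// Sketch of approach #3: close() becomes a no-op so that no caller can
// break other holders of the cached instance; the underlying client is
// released by the finalizer instead. Names here are hypothetical.
class NoOpCloseFileSystem {
    private boolean clientClosed = false;

    public void close() {
        // Intentionally a no-op: the instance is shared via the cache,
        // so one caller's close() must not tear down the client.
    }

    void closeClient() {
        // Real teardown of the underlying client, run exactly once.
        clientClosed = true;
    }

    @Override
    protected void finalize() {
        closeClient();  // GC-driven release of the underlying client
    }

    boolean isUsable() {
        return !clientClosed;
    }
}
```

With this shape, a library calling {{close}} leaves every other reference fully usable, at the cost of tying the client's lifetime to garbage collection.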

I'll post a patch for #3.  Please provide comments if there are superior solutions.
                
> FileSystem close has severe consequences
> ----------------------------------------
>
>                 Key: HADOOP-7973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7973
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 1.0.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> The way {{FileSystem#close}} works is very problematic.  Since the {{FileSystem}} instances are
> cached, any {{close}} by any caller will cause problems for every other reference to it.
> Will add more detail in the comments.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
