hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7352) Contracts of LocalFileSystem and DistributedFileSystem should require FileSystem::listStatus throw IOException not return null upon access error
Date Thu, 02 Jun 2011 06:18:47 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042622#comment-13042622
] 

Matt Foley commented on HADOOP-7352:
------------------------------------

This behavior, of returning null on access error, is following the pattern of File::list and
File::listFiles, which are defined by the JDK.  However, those cause NPE in a lot of Hadoop
code too, which is why they've recently been fixed by HADOOP-7342, HADOOP-7322, HDFS-1934,
and HDFS-2019.  Those patches provided FileUtil alternatives to the JDK methods.  That approach
isn't applicable to FileSystem::listStatus because it is an abstract method in an important
contract class defined by Hadoop.  It has a massive number of callers in Common and Mapred
(although I found none in HDFS).

I believe this change in semantics, to throw IOException instead of returning null, would
have no negative impact.  I've examined the 36 non-trivial callers in Common, and only 2 checked
for null result.  All the others would definitely get NPE on null result, and the two that
did check had faulty logic!  In Mapred there are far more callers, but almost all of them
also will throw NPE on null result.  In going through about half of them, I found one correct
and one incorrect null check in about 20 callers.

Therefore, I believe there is no downside to changing the semantics of the base method in
this way, and we'll get rid of some mystery NPEs.  We will need to scan for any callers that
check for null and use it as a non-error condition; there won't be many but there will be
a couple.

Please give feedback.  If this is acceptable I will provide a patch for each of the underlying
FileSystem.listStatus implementations, and do the careful scan for callers that actually expect
null as an allowed result.

> Contracts of LocalFileSystem and DistributedFileSystem should require FileSystem::listStatus
throw IOException not return null upon access error
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7352
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7352
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs, fs/s3
>            Reporter: Matt Foley
>            Assignee: Matt Foley
>
> In HADOOP-6201 and HDFS-538 it was agreed that FileSystem::listStatus should throw FileNotFoundException
instead of returning null, when the target directory did not exist.
> However, in LocalFileSystem implementation today, FileSystem::listStatus still may return
null, when the target directory exists but does not grant read permission.  This causes NPE
in many callers, for all the reasons cited in HADOOP-6201 and HDFS-538.  See HADOOP-7327 and
its linked issues for examples.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message