hadoop-common-dev mailing list archives

From "Jason Dere (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-10340) FileInputFormat.listStatus() including directories in its results
Date Wed, 12 Feb 2014 19:55:25 GMT
Jason Dere created HADOOP-10340:
-----------------------------------

             Summary: FileInputFormat.listStatus() including directories in its results
                 Key: HADOOP-10340
                 URL: https://issues.apache.org/jira/browse/HADOOP-10340
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Jason Dere


I'm trying to track down HIVE-6401, where we see some "is not a file" errors because getSplits()
is returning directories.  I believe the culprit is this branch in FileInputFormat.listStatus():

{code}
if (recursive && stat.isDirectory()) {
  addInputPathRecursively(result, fs, stat.getPath(),
      inputFilter);
} else {
  result.add(stat);
}
{code}

When recursive is false, directories fall through to the else branch and are added to the results.
Is listStatus() meant to return directories? If not, I think it should look like this:

{code}
if (stat.isDirectory()) {
  if (recursive) {
    addInputPathRecursively(result, fs, stat.getPath(),
        inputFilter);
  }
} else {
  result.add(stat);
}
{code}
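
To make the difference between the two branches concrete, here is a minimal standalone sketch that does not depend on Hadoop. The class ListStatusSketch, the Entry type, and the methods listCurrent, listProposed, and addRecursively are all hypothetical stand-ins invented for illustration; addRecursively only approximates addInputPathRecursively by assuming it collects files and descends into subdirectories.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListStatusSketch {

    // Hypothetical stand-in for FileStatus: a path name, a directory flag, and children.
    static final class Entry {
        final String name;
        final boolean directory;
        final List<Entry> children;

        Entry(String name, boolean directory, Entry... children) {
            this.name = name;
            this.directory = directory;
            this.children = Arrays.asList(children);
        }
    }

    // Same shape as the branch quoted from listStatus(): whatever is not recursed into is added.
    static List<String> listCurrent(List<Entry> stats, boolean recursive) {
        List<String> result = new ArrayList<>();
        for (Entry stat : stats) {
            if (recursive && stat.directory) {
                addRecursively(result, stat);
            } else {
                result.add(stat.name);
            }
        }
        return result;
    }

    // Same shape as the proposed change: directories are never added directly.
    static List<String> listProposed(List<Entry> stats, boolean recursive) {
        List<String> result = new ArrayList<>();
        for (Entry stat : stats) {
            if (stat.directory) {
                if (recursive) {
                    addRecursively(result, stat);
                }
            } else {
                result.add(stat.name);
            }
        }
        return result;
    }

    // Assumed behavior of addInputPathRecursively: add files, descend into subdirectories.
    static void addRecursively(List<String> result, Entry dir) {
        for (Entry child : dir.children) {
            if (child.directory) {
                addRecursively(result, child);
            } else {
                result.add(child.name);
            }
        }
    }

    public static void main(String[] args) {
        List<Entry> stats = Arrays.asList(
            new Entry("a.txt", false),
            new Entry("sub", true, new Entry("sub/b.txt", false)));

        System.out.println(listCurrent(stats, false));   // [a.txt, sub]
        System.out.println(listProposed(stats, false));  // [a.txt]
        System.out.println(listCurrent(stats, true));    // [a.txt, sub/b.txt]
        System.out.println(listProposed(stats, true));   // [a.txt, sub/b.txt]
    }
}
```

In this sketch the two versions agree when recursive is true and differ only when it is false, where the current branch lets the directory "sub" through. Note that the proposed version silently skips directories in the non-recursive case.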



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
