hadoop-mapreduce-issues mailing list archives

From "Devaraj K (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3193) NextGen Mapreduce framework is not able to read the job input recursively.Input is read only for one folder level deep
Date Tue, 18 Oct 2011 10:04:10 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129630#comment-13129630
] 

Devaraj K commented on MAPREDUCE-3193:
--------------------------------------

Here is the problem. FileInputFormat.listStatus descends only one level into each
input directory and treats everything it finds there as a file. As a result, it creates
splits for the nested directories, and the task fails.

{code:title=FileInputFormat.java|borderStyle=solid}
for (int i = 0; i < dirs.length; ++i) {
  Path p = dirs[i];
  FileSystem fs = p.getFileSystem(job.getConfiguration());
  FileStatus[] matches = fs.globStatus(p, inputFilter);
  if (matches == null) {
    errors.add(new IOException("Input path does not exist: " + p));
  } else if (matches.length == 0) {
    errors.add(new IOException("Input Pattern " + p + " matches 0 files"));
  } else {
    for (FileStatus globStat : matches) {
      if (globStat.isDirectory()) {
        for (FileStatus stat : fs.listStatus(globStat.getPath(), inputFilter)) {
          result.add(stat);
        }
      } else {
        result.add(globStat);
      }
    }
  }
}
{code}
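
For comparison, a recursive listing could look roughly like the sketch below. It is
not the fix proposed for this issue and does not use the Hadoop API: it relies on
java.nio.file so that it runs standalone, and the class name RecursiveListing and the
methods listFilesRecursively and collect are hypothetical names chosen for illustration.
The point is only that a directory entry is descended into rather than added to the
result.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone illustration; not part of Hadoop.
class RecursiveListing {

    /** Returns every non-directory entry under root, at any nesting depth. */
    static List<Path> listFilesRecursively(Path root) throws IOException {
        List<Path> result = new ArrayList<>();
        collect(root, result);
        return result;
    }

    private static void collect(Path p, List<Path> result) throws IOException {
        if (Files.isDirectory(p)) {
            // Descend into the directory instead of adding it to the result.
            try (DirectoryStream<Path> children = Files.newDirectoryStream(p)) {
                for (Path child : children) {
                    collect(child, result);
                }
            }
        } else {
            // Only non-directory entries end up in the result.
            result.add(p);
        }
    }
}
```

With an input laid out like the /r1/r2/input.txt example from the issue, this listing
returns only input.txt and never the intermediate directory r2.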

                
> NextGen Mapreduce framework is not able to read the job input recursively.Input is read
> only for one folder level deep
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3193
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Ramgopal N
>
> A java.io.FileNotFoundException is thrown if the input file is more than one folder
> level deep, and the job fails.
> Example: the input file is /r1/r2/input.txt


        
