hadoop-mapreduce-issues mailing list archives

From "Shilo Ayalon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir
Date Thu, 19 Apr 2012 07:21:48 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257331#comment-13257331 ]

Shilo Ayalon commented on MAPREDUCE-3193:
-----------------------------------------

I'm having the same problem with hadoop-1.0.2. Given the following directory structure (in
HDFS):

{code}
one/
├── three/
│   └── four/
│       ├── baz.txt
│       ├── bleh.txt
│       └── foo.txt
└── two/
    ├── bar.txt
    └── gaa.txt
{code}


As no recursive path support is available, I'm walking the root folder and adding all subdirs
to the job. However, adding file-less folders like {{one}} and {{one/three}} as input paths
to the job raises this exception:
{code}
java.io.FileNotFoundException: /user/hduser/data/one (Is a directory)
{code}

The actual number of files present is massive (20,000 to 30,000+), so passing them all on the
command line is impractical. Will this patch be added to the source at some point?
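
The workaround described above (walking the tree and adding only the directories that directly hold files) could be sketched roughly as follows. This is a minimal illustration, not the patch attached to this issue: it uses `java.nio.file` on a local filesystem as a stand-in for HDFS, and the names `InputDirCollector` and `dirsWithFiles` are hypothetical. It also assumes, as in the layout above, that no directory mixes files and subdirectories; a mixed directory would presumably still trigger the same exception.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the directories that are safe to add as job input paths.
class InputDirCollector {

    /** Returns every directory under root (root included) that directly contains at least one file. */
    static List<Path> dirsWithFiles(Path root) throws IOException {
        List<Path> result = new ArrayList<>();
        collect(root, result);
        return result;
    }

    private static void collect(Path dir, List<Path> result) throws IOException {
        boolean hasFile = false;
        List<Path> subdirs = new ArrayList<>();

        // List the immediate children once, separating files from subdirectories.
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                if (Files.isDirectory(child)) {
                    subdirs.add(child);
                } else {
                    hasFile = true;
                }
            }
        }

        // File-less directories such as one/ and one/three/ are skipped.
        if (hasFile) {
            result.add(dir);
        }

        // Recurse after the stream is closed so only one listing is open at a time.
        for (Path sub : subdirs) {
            collect(sub, result);
        }
    }
}
```

For the tree above, this yields only `one/three/four` and `one/two`. In an actual job the listing would come from Hadoop's `FileSystem.listStatus` instead, and each returned directory would be passed to `FileInputFormat.addInputPath`.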

                
> FileInputFormat doesn't read files recursively in the input path dir
> --------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3193
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1, mrv2
>    Affects Versions: 1.0.2, 2.0.0, 3.0.0
>            Reporter: Ramgopal N
>            Assignee: Devaraj K
>         Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch
>
>
> java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails.
> Example: input file is /r1/r2/input.txt


       
