pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-5106) Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
Date Fri, 13 Jan 2017 23:45:26 GMT

     [ https://issues.apache.org/jira/browse/PIG-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5106:
------------------------------------
    Labels: newbie  (was: )

> Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
> -----------------------------------------------------------------------------
>
>                 Key: PIG-5106
>                 URL: https://issues.apache.org/jira/browse/PIG-5106
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>              Labels: newbie
>             Fix For: 0.17.0
>
>
> Many of our classes extending InputFormat have
> {code}
>     /*
>      * This is to support multi-level/recursive directory listing until
>      * MAPREDUCE-1577 is fixed.
>      */
>     @Override
>     protected List<FileStatus> listStatus(JobContext job) throws IOException {
>         return MapRedUtil.getAllFileRecursively(super.listStatus(job),
>                 job.getConfiguration());
>     }
> {code}
> Now that we have dropped Hadoop 1.x, the override can be optimized to
> {code}
>         if (getInputDirRecursive(job)) {
>             return super.listStatus(job);
>         } else {
>             /*
>              * mapreduce.input.fileinputformat.input.dir.recursive is not true
>              * by default for backward compatibility reasons.
>              */
>             return MapRedUtil.getAllFileRecursively(super.listStatus(job),
>                     job.getConfiguration());
>         }
> {code}
> That would avoid one extra iteration when mapreduce.input.fileinputformat.input.dir.recursive
> is set to true by users.
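
The effect of the proposed branch can be sketched without Hadoop on the classpath. The following is a minimal, self-contained model, not Pig's code: Entry, baseListStatus, expandRecursively and expansionVisits are hypothetical stand-ins for FileStatus, super.listStatus(job), MapRedUtil.getAllFileRecursively and a counter added purely for illustration. It assumes the base listing descends into subdirectories only when the recursive flag is set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecursiveListingSketch {

    /** Hypothetical stand-in for FileStatus: a file, or a directory with children. */
    static final class Entry {
        final String path;
        final List<Entry> children; // null for a plain file

        private Entry(String path, List<Entry> children) {
            this.path = path;
            this.children = children;
        }

        static Entry file(String path) {
            return new Entry(path, null);
        }

        static Entry dir(String path, Entry... children) {
            return new Entry(path, Arrays.asList(children));
        }

        boolean isDirectory() {
            return children != null;
        }
    }

    /** Illustration only: counts entries inspected by the expansion helper. */
    static int expansionVisits = 0;

    /** Stand-in for super.listStatus(job): descends only when recursive is true. */
    static List<Entry> baseListStatus(Entry inputDir, boolean recursive) {
        List<Entry> result = new ArrayList<>();
        for (Entry child : inputDir.children) {
            if (child.isDirectory() && recursive) {
                result.addAll(baseListStatus(child, true));
            } else {
                result.add(child); // directories are returned as-is when not recursive
            }
        }
        return result;
    }

    /** Stand-in for MapRedUtil.getAllFileRecursively: expands leftover directories. */
    static List<Entry> expandRecursively(List<Entry> entries) {
        List<Entry> result = new ArrayList<>();
        for (Entry e : entries) {
            expansionVisits++;
            if (e.isDirectory()) {
                result.addAll(expandRecursively(e.children));
            } else {
                result.add(e);
            }
        }
        return result;
    }

    /** The proposed branch: skip the expansion when the base listing already recursed. */
    static List<Entry> listStatus(Entry inputDir, boolean recursive) {
        if (recursive) {
            return baseListStatus(inputDir, true);
        }
        return expandRecursively(baseListStatus(inputDir, false));
    }

    static List<String> paths(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        for (Entry e : entries) {
            out.add(e.path);
        }
        return out;
    }

    static Entry sampleTree() {
        return Entry.dir("in",
                Entry.file("in/a.txt"),
                Entry.dir("in/sub",
                        Entry.file("in/sub/b.txt"),
                        Entry.dir("in/sub/deep", Entry.file("in/sub/deep/c.txt"))));
    }

    public static void main(String[] args) {
        Entry root = sampleTree();
        System.out.println(paths(listStatus(root, true)));
        System.out.println("expansion visits with flag set: " + expansionVisits);
        System.out.println(paths(listStatus(root, false)));
        System.out.println("expansion visits with flag unset: " + expansionVisits);
    }
}
```

Under these assumptions both paths return the same three files, and the expansion helper is never entered when the flag is true, which is the pass the issue proposes to skip.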



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
