hadoop-mapreduce-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3130) Output files are not moved to job output directory after task completion
Date Fri, 25 May 2012 11:56:23 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283322#comment-13283322
] 

Harsh J commented on MAPREDUCE-3130:
------------------------------------

One workaround, if you want a NullOutputFormat without this kind of issue (i.e. one
that works with MultipleOutputs and such), is to subclass FileOutputFormat and dumb it down:

{code}
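// Imports for the enclosing source file, assuming the new
// (org.apache.hadoop.mapreduce) API:
import java.io.IOException;
import org.apache.hadoop.mapred.FileAlreadyExistsException;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Declare this as a nested class of the job driver (hence "static").
public static class DummyOutputFormat<K,V> extends FileOutputFormat<K,V> {
    @Override
    public void checkOutputSpecs(JobContext job)
        throws FileAlreadyExistsException, IOException {
        // Do not check for existing out-dir.
    }

    @Override
    public RecordWriter<K,V> getRecordWriter(TaskAttemptContext context) {
        // Discard all regular records; named outputs go through MultipleOutputs.
        return new RecordWriter<K,V>() {
            @Override
            public void write(K key, V value) {}
            @Override
            public void close(TaskAttemptContext context) {}
        };
    }
}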
{code}
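
The reason this helps is that the subclass still inherits FileOutputFormat's file-based committer, which promotes task output out of _temporary, whereas NullOutputFormat supplies a committer that does nothing. The following is only a toy model of that difference using plain java.nio, not Hadoop code; CommitModel, Committer, NO_OP, MOVE_FILES and runTask are all hypothetical names made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Toy model of task commit. NOT Hadoop code; every name here is hypothetical. */
public class CommitModel {

    /** Decides what happens to a task's temporary output when the task finishes. */
    interface Committer {
        void commitTask(Path attemptDir, Path outputDir);
    }

    /** Models a do-nothing committer, like the one NullOutputFormat supplies. */
    static final Committer NO_OP = (attemptDir, outputDir) -> { };

    /** Models a file-based committer: moves every file from the attempt dir to the output dir. */
    static final Committer MOVE_FILES = (attemptDir, outputDir) -> {
        try {
            // Snapshot the listing first so we do not move files while iterating.
            List<Path> files;
            try (Stream<Path> listing = Files.list(attemptDir)) {
                files = listing.collect(Collectors.toList());
            }
            for (Path f : files) {
                Files.move(f, outputDir.resolve(f.getFileName()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    };

    /**
     * Simulates one task: writes a side file under _temporary, runs the
     * committer, and reports whether the file ended up in the job output dir.
     */
    static boolean runTask(Committer committer) {
        try {
            Path outputDir = Files.createTempDirectory("job-output");
            Path attemptDir = Files.createDirectories(
                outputDir.resolve("_temporary").resolve("attempt_0"));
            Files.write(attemptDir.resolve("named-r-00000"), "k\tv\n".getBytes());
            committer.commitTask(attemptDir, outputDir);
            return Files.exists(outputDir.resolve("named-r-00000"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("no-op committer promoted file: " + runTask(NO_OP));
        System.out.println("file committer promoted file:  " + runTask(MOVE_FILES));
    }
}
```

In this model the no-op committer leaves named-r-00000 stranded under _temporary (the symptom reported in this issue), while the file-moving committer places it in the output directory.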
                
> Output files are not moved to job output directory after task completion
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3130
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3130
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.20.204.0
>         Environment: NA
>            Reporter: Bhallamudi Venkata Siva Kamesh
>
> When MultipleOutputs (with its OutputFormat set to TextOutputFormat) is used to write output
> files to different named outputs, and normal output file generation (i.e. *part-r-<#reduce>*)
> is turned off by setting the job's OutputFormat to NullOutputFormat, the task output is not
> moved to the job output directory ({color:blue}${mapred.output.dir}{color}).
>
> After job completion, the output files are found in
> {color:red}${mapred.output.dir}/_temporary/_attempt-<jobid>-<#reduce>{color} (the task output
> directory) rather than in {color:blue}${mapred.output.dir}{color} (the job output directory).
