hadoop-mapreduce-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3130) Output files are not moved to job output directory after task completion
Date Fri, 25 May 2012 12:06:23 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283326#comment-13283326 ]

Harsh J commented on MAPREDUCE-3130:

We could do one of the following to address this:

1. In the new API MultipleOutputs class, we could document that NullOutputFormat should not be
used and that plain FileOutputFormat should be used instead.
2. In the new API FileOutputFormat, we could change the behavior so that it never creates
empty files. That is, open a file if and only if at least one KV is passed to the
RecordWriter, not upon construction of the RecordWriter.
3. The NullOutputFormat could instead derive from FileOutputFormat with a no-op record
writer as shown above. This wouldn't cause any issues (it would be similar to a job that hasn't
produced any output at all, but without empty files).

I'd prefer (3) or (2) - in that order. Thoughts?
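
To illustrate option (2), here is a minimal sketch of lazy file creation in plain Java. It is not Hadoop code: LazyRecordWriter, its constructor, and its write(String, String) method are hypothetical names made up for this illustration, and a real change would have to go through FileOutputFormat and its RecordWriter implementations. The sketch only shows the idea that the output file is opened on the first record rather than when the writer is constructed.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a record writer; not a Hadoop class. */
final class LazyRecordWriter implements AutoCloseable {
    private final Path file;
    private BufferedWriter out; // stays null until the first record arrives

    LazyRecordWriter(Path file) {
        // Construction only remembers the path; nothing is created on disk.
        this.file = file;
    }

    void write(String key, String value) throws IOException {
        if (out == null) {
            // First record: only now is the output file actually created.
            out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        }
        out.write(key + "\t" + value);
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        // If no record was ever written, there is no file to close.
        if (out != null) {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lazy-writer-demo");

        Path unused = dir.resolve("part-r-00000");
        try (LazyRecordWriter w = new LazyRecordWriter(unused)) {
            // No records are written through this writer.
        }

        Path used = dir.resolve("part-r-00001");
        try (LazyRecordWriter w = new LazyRecordWriter(used)) {
            w.write("key", "value");
        }

        System.out.println("unused writer left a file: " + Files.exists(unused));
        System.out.println("used writer left a file: " + Files.exists(used));
    }
}
```

With this shape, a task that emits nothing through the default output leaves no empty part file behind, while the commit path of the output format is unchanged.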
> Output files are not moved to job output directory after task completion
> ------------------------------------------------------------------------
>                 Key: MAPREDUCE-3130
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3130
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions:
>         Environment: NA
>            Reporter: Bhallamudi Venkata Siva Kamesh
> When MultipleOutputs (with the OutputFormat set to TextOutputFormat) is used to write output files
> to different namedOutputFiles, and normal output file generation (i.e. part-r-<#reduce>) is turned off
> by setting the OutputFormat to NullOutputFormat, task output is not moved to the job output directory.
> After job completion, the output files were found in ${mapred.output.dir}/_temporary/_attempt-<jobid>-<#reduce>
> (task output directory) rather than in ${mapred.output.dir} (job output
