hadoop-mapreduce-issues mailing list archives

From "Gera Shegalov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6009) Map-only job with new-api runs wrong OutputCommitter when cleanup scheduled in a reduce slot
Date Mon, 28 Jul 2014 07:10:41 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075978#comment-14075978 ]

Gera Shegalov commented on MAPREDUCE-6009:
------------------------------------------

A workaround is to manually add mapred.reducer.new-api=true to the job conf.

> Map-only job with new-api runs wrong OutputCommitter when cleanup scheduled in a reduce slot
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6009
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6009
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client, job submission
>    Affects Versions: 1.2.1
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>            Priority: Blocker
>         Attachments: MAPREDUCE-6009.v01-branch-1.2.patch
>
>
> In branch-1, job commit is executed in a JOB_CLEANUP task that may run in either a map
> or a reduce slot. In org.apache.hadoop.mapreduce.Job#setUseNewAPI, the logic that sets
> the reducer new-api flag runs only for jobs that have reducers:
> {code}
>     if (numReduces != 0) {
>       conf.setBooleanIfUnset("mapred.reducer.new-api",
>                              conf.get(oldReduceClass) == null);
>       ...
> {code}
> Therefore, for a map-only job the flag is never set. When cleanup runs in a reduce slot,
> ReduceTask initializes using the old API and runs the incorrect default OutputCommitter
> instead of consulting the OutputFormat.
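
The quoted logic and the workaround can be sketched with a small self-contained model.
This is an illustration only, not Hadoop code: Conf is a hypothetical stand-in for the job
configuration, the key name mapred.reducer.class for the old reducer class is an assumption,
and so are the helper cleanupInReduceSlotUsesNewApi and its default of false when the flag
is unset.

```java
import java.util.HashMap;
import java.util.Map;

public class NewApiFlagSketch {
    // Hypothetical stand-in for the job configuration (not Hadoop's Configuration).
    static final class Conf {
        final Map<String, String> props = new HashMap<>();

        String get(String key) {
            return props.get(key);
        }

        void setBoolean(String key, boolean value) {
            props.put(key, Boolean.toString(value));
        }

        void setBooleanIfUnset(String key, boolean value) {
            if (!props.containsKey(key)) {
                setBoolean(key, value);
            }
        }

        boolean getBoolean(String key, boolean defaultValue) {
            String v = props.get(key);
            return v == null ? defaultValue : Boolean.parseBoolean(v);
        }
    }

    static final String REDUCER_NEW_API = "mapred.reducer.new-api";
    // Assumed key for the old-API reducer class.
    static final String OLD_REDUCE_CLASS = "mapred.reducer.class";

    // Models only the branch quoted in the issue description.
    static void setUseNewApi(Conf conf, int numReduces) {
        if (numReduces != 0) {
            conf.setBooleanIfUnset(REDUCER_NEW_API, conf.get(OLD_REDUCE_CLASS) == null);
        }
    }

    // Assumed model: a cleanup task in a reduce slot reads the reducer flag,
    // falling back to the old API (false) when the flag is unset.
    static boolean cleanupInReduceSlotUsesNewApi(Conf conf) {
        return conf.getBoolean(REDUCER_NEW_API, false);
    }

    public static void main(String[] args) {
        // Map-only job: the branch is skipped, so the flag stays unset.
        Conf mapOnly = new Conf();
        setUseNewApi(mapOnly, 0);
        System.out.println("no workaround: " + cleanupInReduceSlotUsesNewApi(mapOnly));

        // Workaround: set the flag manually before submission.
        Conf patched = new Conf();
        patched.setBoolean(REDUCER_NEW_API, true);
        setUseNewApi(patched, 0);
        System.out.println("workaround:    " + cleanupInReduceSlotUsesNewApi(patched));
    }
}
```

In this model the map-only job without the workaround falls back to the old API in a reduce
slot, while setting the flag by hand makes the reduce-slot cleanup take the new-API path.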



--
This message was sent by Atlassian JIRA
(v6.2#6252)
