hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5051) Combiner not used when NUM_REDUCES=0
Date Thu, 07 Mar 2013 19:18:12 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596218#comment-13596218 ]

Robert Joseph Evans commented on MAPREDUCE-5051:

Damien, the combiner only runs as part of the shuffle phase, and the shuffle phase only runs
when there is a reducer that needs the data to be shuffled. So does your indexing work just fine
even if the indexes for a given key are not all in the same file?

If you want just a combiner to run with no reducers configured, you are going to have to write
something for that yourself.
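
One way to "write something for that yourself" is to do the combining inside the map task. The sketch below is a minimal illustration under stated assumptions, not code from this issue: `BatchingBuffer` is a hypothetical helper, the sink stands in for whatever bulk-index call the job makes, and the Hadoop classes are left out so the example runs on its own. In a real job, `add()` would be called from the mapper's `map()` and `flush()` from `cleanup()` (new `org.apache.hadoop.mapreduce` API) or `close()` (old `org.apache.hadoop.mapred` API).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Hadoop: buffers records inside a map task
// and hands them to a sink in batches, doing the work the combiner used to do.
class BatchingBuffer {
    private final int batchSize;
    private final Consumer<List<String>> sink; // assumed: e.g. a bulk-index call
    private final List<String> pending = new ArrayList<>();

    BatchingBuffer(int batchSize, Consumer<List<String>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Call once per record, from the mapper's map() method.
    void add(String jsonDocument) {
        pending.add(jsonDocument);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Call at the end of the task so the last partial batch is not lost.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending)); // copy so the sink owns its batch
        pending.clear();
    }

    public static void main(String[] args) {
        List<List<String>> sent = new ArrayList<>();
        BatchingBuffer buffer = new BatchingBuffer(2, sent::add);
        for (String doc : new String[] {"{\"id\":1}", "{\"id\":2}", "{\"id\":3}"}) {
            buffer.add(doc); // what map() would do
        }
        buffer.flush();      // what cleanup() or close() would do
        System.out.println(sent.size() + " batches sent");
    }
}
```

With this pattern the job can run with zero reducers, since the batching no longer depends on the shuffle. The trade-off is that the buffer lives in the map task's memory, so the batch size has to be chosen with the task heap in mind.
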
> Combiner not used when NUM_REDUCES=0
> ------------------------------------
>                 Key: MAPREDUCE-5051
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5051
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1
>    Affects Versions: 2.0.2-alpha
>         Environment: CDH4.1.2 MR1
>            Reporter: Damien Hardy
> We have an M/R job that uses a Mapper and a Combiner but has nothing to do in the Reducer:
> bulk indexing of HBase data into ElasticSearch.
> Map output is K / V: #bulk / json_data_to_be_indexed.
> So the job is launched, the maps work, the combiners index, and a reducer is created for nothing (sometimes
> waiting for another M/R job to free a tasktracker slot for the reducer, cf. MAPREDUCE-5019).
> When we put ```job.setNumReduceTasks(0);``` in our job's .run(), the mappers are started but
> the combiners are not used.
