hadoop-mapreduce-issues mailing list archives

From "Damien Hardy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5051) Combiner not used when NUM_REDUCES=0
Date Fri, 08 Mar 2013 08:54:13 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596942#comment-13596942 ]

Damien Hardy commented on MAPREDUCE-5051:

Hmm, that is sad. It fit our needs really well: aggregating things per mapper, easily, for
bulk processing that does not really need global aggregation (such as indexing).
Can you point me to where in the code it is decided whether or not to run the Combiner class?
> Combiner not used when NUM_REDUCES=0
> ------------------------------------
>                 Key: MAPREDUCE-5051
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5051
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1
>    Affects Versions: 2.0.2-alpha
>         Environment: CDH4.1.2 MR1
>            Reporter: Damien Hardy
> We have an M/R job that uses a Mapper and a Combiner but has nothing to do in the Reducer:
> bulk indexing of HBase data into ElasticSearch.
> Map output is K / V: #bulk / json_data_to_be_indexed.
> So the job is launched, the maps work, the combiners index, and a reducer is created for nothing (sometimes
> waiting for another M/R job to free a tasktracker slot for the reducer, cf. MAPREDUCE-5019).
> When we put ```job.setNumReduceTasks(0);``` in our job's run() method, mappers are started but
> combiners are not used.
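
One possible workaround for the per-mapper aggregation described above is to do the batching inside the mapper itself rather than in a Combiner, so that it still happens when the job has zero reduce tasks. The following is only a minimal sketch of that idea in plain Java, with no Hadoop dependency. The class name `InMapperBulkBuffer`, the batch size, and the `sink` callback (which stands in for a bulk-index request) are hypothetical and do not come from the issue. In a real job, `add` would presumably be called from `map()` and `flush` from `cleanup()` (or `close()` in the older API).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper: buffers map output inside the mapper and hands it
 * to a sink in batches, so no Combiner or Reducer is required.
 */
public class InMapperBulkBuffer {
    private final int batchSize;
    // Stands in for the bulk-index call; an assumption for illustration.
    private final Consumer<List<String>> sink;
    private final List<String> pending = new ArrayList<>();

    public InMapperBulkBuffer(int batchSize, Consumer<List<String>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffer one document; flush as soon as a full batch is pending. */
    public void add(String jsonDoc) {
        pending.add(jsonDoc);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Send whatever is still pending; does nothing if the buffer is empty. */
    public void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Pass a copy so the sink never sees the buffer being cleared.
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }

    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        InMapperBulkBuffer buffer =
            new InMapperBulkBuffer(2, batch -> batchSizes.add(batch.size()));
        buffer.add("{\"id\":1}");
        buffer.add("{\"id\":2}");   // second document fills the batch
        buffer.add("{\"id\":3}");
        buffer.flush();             // final flush, as a cleanup step would do
        System.out.println(batchSizes); // prints [2, 1]
    }
}
```

The trade-off of this approach is that the buffer lives in the map task's memory, so the batch size has to be chosen with the task heap in mind.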
