hadoop-mapreduce-dev mailing list archives

From "Danny Leshem (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1574) Combiners should implement a specialized "Combiner" interface, not the generic "Reducer" interface
Date Mon, 08 Mar 2010 12:51:27 GMT
Combiners should implement a specialized "Combiner" interface, not the generic "Reducer" interface

                 Key: MAPREDUCE-1574
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1574
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
    Affects Versions: 0.20.1
            Reporter: Danny Leshem
            Priority: Minor

I just spent 30 minutes trying to figure out why my job throws "java.io.IOException: wrong
key class" when I pass my Reducer class to Job.setCombinerClass. Finally, I understood that
a Reducer can act as a Combiner only if its output key/value types are the same as its input
key/value types.

So yes, this is documented. But you can make life easier for users by defining a Combiner
interface (that Job.setCombinerClass will accept) to enforce this at compile time. The new interface
should extend the Reducer interface and specialize it (is it even possible with generics?).
Alternatively, you can call this interface "SimpleReducer".
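
To make the generics question concrete, here is a minimal sketch of how such a specialization could look. It deliberately uses stand-in types rather than Hadoop's own: the Collector, Reducer and Combiner interfaces, the setCombinerClass method, and both example classes are hypothetical names defined only for this illustration, so it shows that the type constraint is expressible in Java, not how Hadoop would have to implement it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CombinerSketch {

    /** Hypothetical stand-in for an output collector. */
    interface Collector<K, V> {
        void collect(K key, V value);
    }

    /** Hypothetical stand-in for a reducer contract; not Hadoop's own Reducer. */
    interface Reducer<KI, VI, KO, VO> {
        void reduce(KI key, Iterable<VI> values, Collector<KO, VO> out);
    }

    /** Hypothetical Combiner: a Reducer whose output types equal its input types. */
    interface Combiner<K, V> extends Reducer<K, V, K, V> { }

    /** Hypothetical analogue of Job.setCombinerClass that accepts only Combiners. */
    static String setCombinerClass(Class<? extends Combiner<?, ?>> cls) {
        return cls.getSimpleName();
    }

    /** Sums values per key; input and output types match, so it is a Combiner. */
    static class SumCombiner implements Combiner<String, Integer> {
        @Override
        public void reduce(String key, Iterable<Integer> values, Collector<String, Integer> out) {
            int sum = 0;
            for (int v : values) { sum += v; }
            out.collect(key, sum);
        }
    }

    /** Changes the value type (Integer to String), so it is only a Reducer. */
    static class FormattingReducer implements Reducer<String, Integer, String, String> {
        @Override
        public void reduce(String key, Iterable<Integer> values, Collector<String, String> out) {
            int sum = 0;
            for (int v : values) { sum += v; }
            out.collect(key, "total:" + sum);
        }
    }

    /** Runs a combiner on one key and records what it emits as "key=value" strings. */
    static <K, V> List<String> run(Combiner<K, V> combiner, K key, List<V> values) {
        List<String> emitted = new ArrayList<>();
        combiner.reduce(key, values, (k, v) -> emitted.add(k + "=" + v));
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(setCombinerClass(SumCombiner.class));              // SumCombiner
        System.out.println(run(new SumCombiner(), "a", Arrays.asList(1, 2, 3))); // [a=6]
        // setCombinerClass(FormattingReducer.class);  // rejected by the compiler
    }
}
```

In this sketch the mismatch that currently surfaces as a runtime "wrong key class" exception becomes a compile error, because FormattingReducer is not a subtype of Combiner.
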

If the generics trick suggested above is impossible to implement, for the (common?) case of
having the same class acting as Combiner and Reducer you can do either of the following:
1) Thin Combiner implementation that wraps a given Reducer.
2) Add a new method, say Job.setCombinerClassToReducer (that accepts a Reducer), acting similarly
to the new Job.setCombinerClass - but here the name should alert the user she's doing something

