hadoop-mapreduce-issues mailing list archives

From "Clarence Gardner (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-2064) Tutorial should mention SetMapOutputKeyClass
Date Sun, 12 Sep 2010 20:59:33 GMT
Tutorial should mention SetMapOutputKeyClass

                 Key: MAPREDUCE-2064
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2064
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: documentation
    Affects Versions: 0.21.0
            Reporter: Clarence Gardner
            Priority: Minor

The official tutorial (mapred_tutorial.html), like every other tutorial I've seen on the web,
shows a program whose mapper and reducer emit the same key/value datatypes. It includes a
configuration call to Job.setOutput{Key,Value}Class but doesn't say that the call applies to
both the mapper and the reducer; it reads as though it refers only to the reducer output.
This could be mentioned in the "Job Configuration" section. Here is a possible addition,
after the "The Job is used to specify ..." paragraph:

The job also configures the types of its key/value pairs with setOutputKeyClass(type) and
setOutputValueClass(type), which apply to both the mapper and reducer classes. If the types
output by the mapper and reducer are not the same, follow those calls with
setMapOutputKeyClass(type) and setMapOutputValueClass(type).

(I'm assuming that at least a call to setOutput{Key,Value}Class is required.)
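
To make the proposed wording concrete, here is a small self-contained sketch of the rule it
describes: the map output types fall back to the job output types unless they are set
explicitly. JobTypeModel is a hypothetical toy class, not Hadoop code; it only borrows the
setter names from Job, and it uses plain Java types (String, Integer, Double) in place of
Writable classes so that it runs without Hadoop on the classpath. It does not try to
reproduce whatever Hadoop does when neither setter is called.

```java
// Toy model of how a job's declared key/value types resolve.
// Hypothetical class for illustration; it mirrors Job's method names only.
final class JobTypeModel {
    private Class<?> outputKeyClass;
    private Class<?> outputValueClass;
    private Class<?> mapOutputKeyClass;    // null means "not set explicitly"
    private Class<?> mapOutputValueClass;  // null means "not set explicitly"

    void setOutputKeyClass(Class<?> c) { outputKeyClass = c; }
    void setOutputValueClass(Class<?> c) { outputValueClass = c; }
    void setMapOutputKeyClass(Class<?> c) { mapOutputKeyClass = c; }
    void setMapOutputValueClass(Class<?> c) { mapOutputValueClass = c; }

    // Types of the pairs written by the reducer (the job's final output).
    Class<?> getOutputKeyClass() { return outputKeyClass; }
    Class<?> getOutputValueClass() { return outputValueClass; }

    // Types of the pairs emitted by the mapper: the explicit setting if
    // there is one, otherwise the job output type.
    Class<?> getMapOutputKeyClass() {
        return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
    }
    Class<?> getMapOutputValueClass() {
        return mapOutputValueClass != null ? mapOutputValueClass : outputValueClass;
    }

    public static void main(String[] args) {
        JobTypeModel job = new JobTypeModel();

        // Only the job output types are declared: the mapper inherits them.
        job.setOutputKeyClass(String.class);
        job.setOutputValueClass(Double.class);
        System.out.println(job.getMapOutputValueClass().getSimpleName()); // Double

        // The mapper emits Integer values while the reducer writes Double values,
        // so the map output value type is declared separately.
        job.setMapOutputValueClass(Integer.class);
        System.out.println(job.getMapOutputValueClass().getSimpleName()); // Integer
        System.out.println(job.getOutputValueClass().getSimpleName());    // Double
    }
}
```

In a real job the same four setters would be called on the Job object with Writable types,
for example a mapper emitting (Text, IntWritable) feeding a reducer that writes
(Text, DoubleWritable).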
