hadoop-mapreduce-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-2064) Tutorial should mention SetMapOutputKeyClass
Date Thu, 11 Feb 2016 22:51:18 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Allen Wittenauer updated MAPREDUCE-2064:
    Fix Version/s:     (was: 1.0.4)

> Tutorial should mention SetMapOutputKeyClass
> --------------------------------------------
>                 Key: MAPREDUCE-2064
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2064
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: documentation
>    Affects Versions: 0.21.0
>            Reporter: Clarence Gardner
>            Priority: Minor
>              Labels: newbie
> The official tutorial (mapred_tutorial.html), like all the other tutorials I've seen on the web, shows a program that uses the same datatypes for the key/value pairs emitted by the mapper and by the reducer. It shows a configuration call to Job.setOutput{Key,Value}Class but doesn't say that the call applies to both the mapper and the reducer; it sounds as if it refers only to the reducer output. This could be mentioned in the "Job Configuration" section. Here is a possible addition, after the "The Job is used to specify ..." paragraph:
> The job also configures the types of its key/value pairs with setOutputKeyClass(type) and setOutputValueClass(type), which apply to both the mapper and reducer classes. If the types output by the mapper and reducer are not the same, those calls should be followed by setMapOutputKeyClass(type) and setMapOutputValueClass(type).
> (I'm assuming that at least a call to setOutput{Key,Value}Class is required.)
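
The rule proposed in the description can be sketched as a small toy model. This is not Hadoop code: JobTypeConfig is a hypothetical stand-in class, and String, Integer and Long stand in for Hadoop's Writable types. It assumes, as the proposed wording describes, that the map output types follow the job output types unless they are set explicitly.

```java
// Toy model only: JobTypeConfig is a hypothetical class, not part of Hadoop.
// It mimics the rule described above; java.lang types stand in for Writables.
class JobTypeConfig {
    private Class<?> outputKeyClass;
    private Class<?> outputValueClass;
    private Class<?> mapOutputKeyClass;   // null means "not set explicitly"
    private Class<?> mapOutputValueClass; // null means "not set explicitly"

    void setOutputKeyClass(Class<?> c) { outputKeyClass = c; }
    void setOutputValueClass(Class<?> c) { outputValueClass = c; }
    void setMapOutputKeyClass(Class<?> c) { mapOutputKeyClass = c; }
    void setMapOutputValueClass(Class<?> c) { mapOutputValueClass = c; }

    // Types of the final (reducer) output.
    Class<?> getOutputKeyClass() { return outputKeyClass; }
    Class<?> getOutputValueClass() { return outputValueClass; }

    // Mapper output types fall back to the job output types when unset.
    Class<?> getMapOutputKeyClass() {
        return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
    }
    Class<?> getMapOutputValueClass() {
        return mapOutputValueClass != null ? mapOutputValueClass : outputValueClass;
    }

    public static void main(String[] args) {
        JobTypeConfig job = new JobTypeConfig();
        job.setOutputKeyClass(String.class);
        job.setOutputValueClass(Integer.class);
        // No map-specific type set yet, so the mapper value type follows the job's.
        System.out.println(job.getMapOutputValueClass().getSimpleName());

        // The mapper emits a different value type than the reducer.
        job.setMapOutputValueClass(Long.class);
        System.out.println(job.getMapOutputValueClass().getSimpleName());
        System.out.println(job.getOutputValueClass().getSimpleName());
    }
}
```

In this model, overriding the mapper's value type leaves the reducer's output type untouched, which is the distinction the suggested tutorial paragraph is meant to make explicit.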

This message was sent by Atlassian JIRA