hadoop-mapreduce-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1065) Modify the mapred tutorial documentation to use new mapreduce api.
Date Tue, 23 Mar 2010 01:21:31 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-1065:
-------------------------------------

    Status: Open  (was: Patch Available)

Thanks for taking on this task. A few points that should be updated along with this follow,
though they are not necessarily related to the new API. A full edit (making the "how many
{maps,reduces}" guidance more helpful, etc.) can be part of a separate issue, but it would be
nice to correct flagrant errors.
* This has not been true for some time (0.17?):
{quote}The key and value classes have to be serializable by the framework and hence need to
implement the Writable interface. Additionally, the key classes have to implement the WritableComparable
interface to facilitate sorting by the framework.{quote}
* The "Inputs and Outputs" section should note that *combine* can run zero or more times
(*combine\** may be sufficient)
* Though probably obvious, it may be helpful to note in the {{Mapper}} subsection what grouping
occurs when no grouping comparator is set
* Removing "then" is more accurate in this sentence:
{quote}The Mapper outputs are sorted and -then- partitioned per Reducer.{quote}
* The link here:
{quote}While some job parameters are straight-forward to set (e.g. setNumReduceTasks(int)),
other parameters interact subtly with the rest of the framework and/or job configuration and
are more complex to set (e.g. setNumMapTasks(int)). {quote}
is broken (there is no {{Job::setNumMapTasks}})
* In the _Map Parameters_ section, the reference to {{io.sort.buffer.spill.percent}} should
be {{mapreduce.map.sort.spill.percent}}
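
To illustrate the *combine* point above, here is a minimal sketch in plain Java. It uses no Hadoop classes; {{CombineSketch}}, {{sum}}, {{run}} and {{sampleMapOutput}} are hypothetical names made up for this example, and the model of "applying the combiner N times" is a simplification, not how the framework schedules combines. It only shows why a combiner whose output type matches its input type, and whose operation is associative and commutative, gives the same reduce result whether it runs zero, one, or several times.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch (no Hadoop classes): simulates a sum reducer whose
// combiner is applied zero or more times to the map output.
class CombineSketch {

    // Hypothetical stand-in for both the combiner and the reducer:
    // sums the counts for each key.
    static Map<String, Integer> sum(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> r : records) {
            out.merge(r.getKey(), r.getValue(), Integer::sum);
        }
        return out;
    }

    // Applies the combiner `passes` times to the map output, then reduces.
    static Map<String, Integer> run(List<Map.Entry<String, Integer>> mapOutput, int passes) {
        List<Map.Entry<String, Integer>> current = new ArrayList<>(mapOutput);
        for (int i = 0; i < passes; i++) {
            // Combiner output is fed back in as ordinary (key, value) records.
            current = new ArrayList<>(sum(current).entrySet());
        }
        return sum(current);
    }

    // Small made-up map output for a word-count style job.
    static List<Map.Entry<String, Integer>> sampleMapOutput() {
        return List.of(Map.entry("a", 1), Map.entry("b", 1), Map.entry("a", 1));
    }

    public static void main(String[] args) {
        for (int passes = 0; passes <= 2; passes++) {
            System.out.println(passes + " combine pass(es): " + run(sampleMapOutput(), passes));
        }
    }
}
```

Each of the three runs prints the same totals for the sample input. A combiner that, say, computed an average directly would not have this property, which is why the tutorial should state the zero-or-more behavior explicitly.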

> Modify the mapred tutorial documentation to use new mapreduce api.
> ------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1065
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1065
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 0.21.0
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Aaron Kimball
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPREDUCE-1065.2.patch, MAPREDUCE-1065.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

