hadoop-mapreduce-issues mailing list archives

From "Ed Kohlwey (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1216) MRUnit Should Sort Reduce Input
Date Mon, 16 Nov 2009 18:47:39 GMT
MRUnit Should Sort Reduce Input
-------------------------------

                 Key: MAPREDUCE-1216
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1216
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.20.1
         Environment: Cloudera Distribution for Hadoop 0.20.1 + 133
            Reporter: Ed Kohlwey


MRUnit should sort the input to a reduce task the same way Hadoop does.
This matters for any reducer that relies on input order, for instance one that removes duplicate key/value pairs by comparing each value with the previous one.

Example:
{code:java}
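import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

class BadReducer extends Reducer<Text, Text, Text, Text> {
  // Writes a value only when it differs from the previous one, so
  // duplicates are removed only if equal values arrive next to each other.
  @Override
  public void reduce(Text key, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {
    Text last = new Text();
    for (Text text : values) {
      if (!text.equals(last)) {
        context.write(key, text);
        last.set(text);
      }
    }
  }
}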
{code}

{code:java}
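// Duplicate values are adjacent, so BadReducer drops the second "bar".
ReduceDriver<Text, Text, Text, Text> driver =
    new ReduceDriver<Text, Text, Text, Text>(new BadReducer());
driver.setInputKey(new Text("foo"));
driver.addInputValue(new Text("bar"));
driver.addInputValue(new Text("bar"));
driver.addInputValue(new Text("foo"));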
{code}
produces different results than 
{code:java}
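// Duplicate values are separated by "foo", so BadReducer writes both.
ReduceDriver<Text, Text, Text, Text> driver =
    new ReduceDriver<Text, Text, Text, Text>(new BadReducer());
driver.setInputKey(new Text("foo"));
driver.addInputValue(new Text("bar"));
driver.addInputValue(new Text("foo"));
driver.addInputValue(new Text("bar"));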
{code}
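
The difference between the two inputs can be reproduced without Hadoop or MRUnit. The following is a minimal sketch in plain Java, using strings in place of Text. The class and method names (SortedReduceInputSketch, dedupAdjacent, dedupSorted) are hypothetical and exist only for illustration. dedupSorted stands in for the behavior requested here and is not MRUnit code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedReduceInputSketch {

    // Same logic as BadReducer: keep a value only when it differs
    // from the value seen immediately before it.
    static List<String> dedupAdjacent(List<String> values) {
        List<String> out = new ArrayList<String>();
        String last = "";
        for (String value : values) {
            if (!value.equals(last)) {
                out.add(value);
                last = value;
            }
        }
        return out;
    }

    // Hypothetical stand-in for the requested behavior: sort a copy of
    // the values first, then apply the same adjacent-duplicate filter.
    static List<String> dedupSorted(List<String> values) {
        List<String> sorted = new ArrayList<String>(values);
        Collections.sort(sorted);
        return dedupAdjacent(sorted);
    }

    public static void main(String[] args) {
        List<String> adjacent = Arrays.asList("bar", "bar", "foo");
        List<String> interleaved = Arrays.asList("bar", "foo", "bar");

        System.out.println(dedupAdjacent(adjacent));     // [bar, foo]
        System.out.println(dedupAdjacent(interleaved));  // [bar, foo, bar]
        System.out.println(dedupSorted(interleaved));    // [bar, foo]
    }
}
```

With the values sorted first, both input orders give the same result, which is the property a test of such a reducer would depend on.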

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

