hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-686) job.setOutputValueComparatorClass(theClass) should be supported
Date Wed, 06 Jun 2007 06:19:26 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley resolved HADOOP-686.

       Resolution: Duplicate
    Fix Version/s: 0.13.0
         Assignee:     (was: Owen O'Malley)

This was fixed by HADOOP-485.

> job.setOutputValueComparatorClass(theClass) should be supported
> ---------------------------------------------------------------
>                 Key: HADOOP-686
>                 URL: https://issues.apache.org/jira/browse/HADOOP-686
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>         Environment: all environment
>            Reporter: Feng Jiang
>             Fix For: 0.13.0
> If the input of the Reduce phase is:
> K2, V3
> K2, V2
> K1, V5
> K1, V3
> K1, V4
> then in the current Hadoop, the reduce output could be:
> K1, (V5, V3, V4)
> K2, (V3, V2)
> But I hope Hadoop supports job.setOutputValueComparatorClass(theClass), so that I can
> keep the values for each key in order, and the output could be:
> K1, (V3, V4, V5) 
> K2, (V2, V3)
> This feature is very important, I think. Without it, we have to do the sorting ourselves
> and worry about the possibility that the values are too large to fit into memory,
> which makes the code too hard to read. That is why I think this feature is so
> important and should be done in the Hadoop framework.
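
The ordering requested above can be illustrated with a minimal plain-Java sketch. It is
not Hadoop code and not the HADOOP-485 implementation: the class SortedValuesSketch and
the helper groupWithSortedValues are hypothetical names used only for illustration. The
idea shown is to sort records by key and then by value before grouping, so each key's
values arrive already ordered. The sketch sorts in memory; the point of the request is
that the framework would do this ordering as part of its own sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortedValuesSketch {

    // Hypothetical helper (not a Hadoop API): groups {key, value} pairs by key,
    // with each key's values ordered by the supplied comparator.
    static Map<String, List<String>> groupWithSortedValues(
            List<String[]> records, Comparator<String> valueOrder) {
        // Compare by key first, then break ties with the value comparator.
        Comparator<String[]> byKeyThenValue = (a, b) -> {
            int c = a[0].compareTo(b[0]);
            return c != 0 ? c : valueOrder.compare(a[1], b[1]);
        };

        // Copy before sorting so the caller's list is left unchanged.
        List<String[]> sorted = new ArrayList<>(records);
        sorted.sort(byKeyThenValue);

        // After the sort, equal keys are adjacent and their values are in order,
        // so a single pass is enough to build the groups.
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] record : sorted) {
            grouped.computeIfAbsent(record[0], k -> new ArrayList<>()).add(record[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The reduce input from the issue description.
        List<String[]> input = Arrays.asList(
                new String[] {"K2", "V3"},
                new String[] {"K2", "V2"},
                new String[] {"K1", "V5"},
                new String[] {"K1", "V3"},
                new String[] {"K1", "V4"});

        // prints {K1=[V3, V4, V5], K2=[V2, V3]}
        System.out.println(groupWithSortedValues(input, Comparator.naturalOrder()));
    }
}
```

Passing a different comparator, for example Comparator.reverseOrder(), changes only the
order of values within each key; the keys are still grouped the same way.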

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
