hadoop-common-dev mailing list archives

From "schubert zhang (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-5810) Let combiner use different comparator instead of OutputKeyComparator
Date Tue, 12 May 2009 08:46:45 GMT
Let combiner use different comparator instead of OutputKeyComparator
--------------------------------------------------------------------

                 Key: HADOOP-5810
                 URL: https://issues.apache.org/jira/browse/HADOOP-5810
             Project: Hadoop Core
          Issue Type: Wish
          Components: mapred
    Affects Versions: 0.20.0, 0.19.1
         Environment: hadoop 0.19, or 0.20
            Reporter: schubert zhang
            Priority: Minor


I have a dataset whose map key is "city+userid+time". The mapper output is sorted by this
map key.

Then I group the reduce input by "city+userid" by defining my own OutputValueGroupingComparator,
which compares only the "city+userid" part of the map key. I still want the records within each
group to be sorted by time. This works fine.
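
The setup described above can be sketched in plain Java, without any Hadoop classes. This is only an illustration under stated assumptions: the Key class, the SORT and GROUP comparators and the sortAndGroup helper are hypothetical names that stand in for the map key, the OutputKeyComparator and the OutputValueGroupingComparator. It is not the reporter's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration of sort order vs. grouping; no Hadoop classes are used.
final class SecondarySortSketch {

    // Hypothetical composite map key: city + userid + time.
    static final class Key {
        final String city;
        final String userId;
        final long time;

        Key(String city, String userId, long time) {
            this.city = city;
            this.userId = userId;
            this.time = time;
        }
    }

    // Stands in for the OutputKeyComparator: full order on city, userid, time.
    static final Comparator<Key> SORT =
        Comparator.comparing((Key k) -> k.city)
                  .thenComparing(k -> k.userId)
                  .thenComparingLong(k -> k.time);

    // Stands in for the grouping comparator: looks at city and userid only.
    static final Comparator<Key> GROUP =
        Comparator.comparing((Key k) -> k.city)
                  .thenComparing(k -> k.userId);

    // Sorts a copy of the keys with SORT, then starts a new group
    // whenever GROUP reports that two neighbouring keys differ.
    static List<List<Key>> sortAndGroup(List<Key> keys) {
        List<Key> sorted = new ArrayList<>(keys);
        sorted.sort(SORT);
        List<List<Key>> groups = new ArrayList<>();
        Key previous = null;
        for (Key k : sorted) {
            if (previous == null || GROUP.compare(previous, k) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
            previous = k;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Key> keys = Arrays.asList(
            new Key("beijing", "u1", 30L),
            new Key("shanghai", "u2", 5L),
            new Key("beijing", "u1", 10L));
        for (List<Key> group : sortAndGroup(keys)) {
            Key first = group.get(0);
            System.out.println(first.city + "+" + first.userId + ": " + group.size() + " record(s)");
        }
    }
}
```

Each inner list corresponds to one reduce call: all keys share "city+userid", and they appear in ascending time order because the sort ran on the full key.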

But to improve performance, I want to use a combiner that also groups by "city+userid",
while its input stays sorted by "city+userid+time".

So I would like a new feature that lets the combiner use a different comparator instead of the OutputKeyComparator.

For example, a CombinerGroupingComparator?
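
To make the request concrete, here is a rough plain-Java simulation of a combiner pass with a pluggable grouping comparator. Everything in it is an assumption made for illustration: the Record class, the FULL_KEY and CITY_USER comparators, and the combine helper are hypothetical, and the issue does not say what the combiner computes, so the sketch simply sums a count. No Hadoop API is used or implied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java simulation of the wished-for behaviour; it does not use any Hadoop API.
final class CombinerGroupingSketch {

    // Hypothetical record: composite key (city, userid, time) plus a count to combine.
    static final class Record {
        final String city;
        final String userId;
        final long time;
        final int count;

        Record(String city, String userId, long time, int count) {
            this.city = city;
            this.userId = userId;
            this.time = time;
            this.count = count;
        }
    }

    // Grouping on the whole key, as when the combiner reuses the sort comparator.
    static final Comparator<Record> FULL_KEY =
        Comparator.comparing((Record r) -> r.city)
                  .thenComparing(r -> r.userId)
                  .thenComparingLong(r -> r.time);

    // Grouping on city and userid only, as the proposed comparator would allow.
    static final Comparator<Record> CITY_USER =
        Comparator.comparing((Record r) -> r.city)
                  .thenComparing(r -> r.userId);

    // Simulated combiner pass. The input must already be sorted by FULL_KEY.
    // Consecutive records that `grouping` treats as equal are merged into one
    // record that keeps the first key of the group and sums the counts.
    static List<Record> combine(List<Record> sorted, Comparator<Record> grouping) {
        List<Record> out = new ArrayList<>();
        for (Record r : sorted) {
            int last = out.size() - 1;
            if (last >= 0 && grouping.compare(out.get(last), r) == 0) {
                Record head = out.get(last);
                out.set(last, new Record(head.city, head.userId, head.time, head.count + r.count));
            } else {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Record> sorted = Arrays.asList(
            new Record("beijing", "u1", 1L, 1),
            new Record("beijing", "u1", 2L, 1),
            new Record("beijing", "u1", 2L, 1),
            new Record("shanghai", "u2", 5L, 1));
        System.out.println("full-key grouping:    " + combine(sorted, FULL_KEY).size() + " records");
        System.out.println("city+userid grouping: " + combine(sorted, CITY_USER).size() + " records");
    }
}
```

With full-key grouping the combiner can only merge records that share the same time, whereas grouping on "city+userid" lets it collapse every record of a user in a city into one, which is the reduction in map output the issue is after.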

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

