hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-1827) Reducer.reduce method's OutputCollector is too strict, it shouldn't need the key to be WritableComparable
Date Fri, 14 Dec 2007 03:23:43 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley resolved HADOOP-1827.
-----------------------------------

    Resolution: Won't Fix

> Reducer.reduce method's OutputCollector is too strict, it shouldn't need the key to be WritableComparable
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1827
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1827
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Arun C Murthy
>
> The output of the {{Reducer}}'s reduce method is *not* sorted, hence the {{OutputCollector}}
passed to it shouldn't require the *key* to be {{WritableComparable}}; passing a {{Writable}}
should suffice.
> Thus
> {code: title=Reducer.java}
> public interface Reducer<K2 extends WritableComparable, V2 extends Writable, 
>                          K3 extends WritableComparable, V3 extends Writable> 
> extends JobConfigurable, Closeable {
>   void reduce(K2 key, Iterator<V2> values, OutputCollector<K3, V3> output,
>               Reporter reporter)
>   throws IOException;
> }
> {code}
> should, technically, be:
> {code: title=Reducer.java}
> public interface Reducer<K2 extends WritableComparable, V2 extends Writable, 
>                          K3 extends Writable, V3 extends Writable> 
> extends JobConfigurable, Closeable {
>   void reduce(K2 key, Iterator<V2> values, OutputCollector<K3, V3> output,
>               Reporter reporter)
>   throws IOException;
> }
> {code}
> Pros:
> It removes an artificial limitation that forces applications to emit a <{{WritableComparable}}, {{Writable}}> pair rather than a <{{Writable}}, {{Writable}}> pair, thereby simplifying some applications (I ran into a few recently... admittedly trivial ones).
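To make the benefit concrete, here is a minimal sketch of a reduce step that emits a key which is only {{Writable}}. It does not use Hadoop itself: `Writable`, `WritableComparable` and `RelaxedOutputCollector` below are hypothetical stand-ins declared locally, and `WordKey`, `Label` and `LongValue` are made-up illustration types. It assumes the relaxed bound proposed above, which is not what the existing interface allows.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class RelaxedReducerSketch {
    // Reduce step under the proposed bounds: sortable input key, Writable-only output key.
    static void reduce(WordKey key, Iterator<LongValue> values,
                       RelaxedOutputCollector<Label, LongValue> output) throws IOException {
        long total = 0;
        while (values.hasNext()) {
            total += values.next().value;
        }
        output.collect(new Label("total:" + key.word), new LongValue(total));
    }

    // Runs the reducer once and returns the collected pairs formatted as "key=value".
    static List<String> run() {
        List<String> collected = new ArrayList<>();
        RelaxedOutputCollector<Label, LongValue> output =
            (k, v) -> collected.add(k.text + "=" + v.value);
        try {
            reduce(new WordKey("hadoop"),
                   Arrays.asList(new LongValue(1), new LongValue(2), new LongValue(3)).iterator(),
                   output);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

// Hypothetical stand-ins for the Hadoop I/O interfaces (not the real classes).
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

interface WritableComparable<T> extends Writable, Comparable<T> {}

// Hypothetical collector whose key bound is relaxed to Writable.
interface RelaxedOutputCollector<K extends Writable, V extends Writable> {
    void collect(K key, V value) throws IOException;
}

// Reducer input key: must be sortable because reduce input is grouped by key.
final class WordKey implements WritableComparable<WordKey> {
    String word;
    WordKey(String word) { this.word = word; }
    public void write(DataOutput out) throws IOException { out.writeUTF(word); }
    public void readFields(DataInput in) throws IOException { word = in.readUTF(); }
    public int compareTo(WordKey other) { return word.compareTo(other.word); }
}

// Reducer output key: serializable, but deliberately has no compareTo.
final class Label implements Writable {
    String text;
    Label(String text) { this.text = text; }
    public void write(DataOutput out) throws IOException { out.writeUTF(text); }
    public void readFields(DataInput in) throws IOException { text = in.readUTF(); }
}

final class LongValue implements Writable {
    long value;
    LongValue(long value) { this.value = value; }
    public void write(DataOutput out) throws IOException { out.writeLong(value); }
    public void readFields(DataInput in) throws IOException { value = in.readLong(); }
}
```

With the existing bound (K3 extends {{WritableComparable}}), `Label` would need a `compareTo` that nothing downstream of the reducer ever calls.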
> Cons:
> 1. We now need a separate {{Combiner}} interface, since the combiner's {{OutputCollector}}
*needs* to be able to sort keys, hence requires a {{WritableComparable}} - same as the {{Mapper}}.
> 2. We need a separate {{SortableOutputCollector}} (for {{Mapper}}/{{Combiner}}) and a
{{NonSortableOutputCollector}} (for {{Reducer}}).
> 3. Alas! As a consequence of (1) & (2), we can no longer use the same class as both a {{Reducer}} and a {{Combiner}}, a serious compatibility issue.
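The three cons above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `Comparable` stands in for {{WritableComparable}}, unbounded type parameters stand in for {{Writable}}, and the `Combiner` and split `Reducer` interfaces, along with `SumReducerOnly`, `SumBoth` and `SumSupport`, are hypothetical. Only the two collector names come from the list above.

```java
import java.util.Iterator;

final class CombinerSplitSketch {
    public static void main(String[] args) {
        System.out.println(SumSupport.usableAsCombiner(new SumReducerOnly())); // false
        System.out.println(SumSupport.usableAsCombiner(new SumBoth()));        // true
    }
}

// Collector for Mapper/Combiner output: keys must be sortable.
interface SortableOutputCollector<K extends Comparable<K>, V> {
    void collect(K key, V value);
}

// Collector for Reducer output: no ordering requirement on keys.
interface NonSortableOutputCollector<K, V> {
    void collect(K key, V value);
}

// Hypothetical separate Combiner interface, as con (1) would require.
interface Combiner<K extends Comparable<K>, V> {
    void combine(K key, Iterator<V> values, SortableOutputCollector<K, V> output);
}

// Hypothetical relaxed Reducer: only the input key is sortable.
interface Reducer<K2 extends Comparable<K2>, V2, K3, V3> {
    void reduce(K2 key, Iterator<V2> values, NonSortableOutputCollector<K3, V3> output);
}

// Implements only Reducer, so it can no longer double as a combiner.
final class SumReducerOnly implements Reducer<String, Long, String, Long> {
    public void reduce(String key, Iterator<Long> values,
                       NonSortableOutputCollector<String, Long> output) {
        output.collect(key, SumSupport.sum(values));
    }
}

// To serve both roles, a class has to implement both interfaces explicitly.
final class SumBoth implements Combiner<String, Long>, Reducer<String, Long, String, Long> {
    public void combine(String key, Iterator<Long> values,
                        SortableOutputCollector<String, Long> output) {
        output.collect(key, SumSupport.sum(values));
    }

    public void reduce(String key, Iterator<Long> values,
                       NonSortableOutputCollector<String, Long> output) {
        output.collect(key, SumSupport.sum(values));
    }
}

final class SumSupport {
    // Shared summing logic used by both roles.
    static long sum(Iterator<Long> values) {
        long total = 0;
        while (values.hasNext()) {
            total += values.next();
        }
        return total;
    }

    // Mimics a framework check for whether a class can fill the combiner slot.
    static boolean usableAsCombiner(Object candidate) {
        return candidate instanceof Combiner;
    }
}
```

Every existing job that registers its reducer class as the combiner would be in the position of `SumReducerOnly` after such a split, which is the compatibility problem described in (3).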
> The purpose of this issue is two-fold:
> 1. Spark a discussion among folks, on both hadoop-dev & hadoop-users, to figure out whether this really is a problem, i.e. do folks really care about this anomaly in the existing {{Reducer}} interface? Also, is it worth the pain (@see 'Cons') to fix it?
> 2. Even if we decide to live with it, this issue could record for posterity why we love
hadoop, warts and all. *smile*
> Let's discuss...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

