hadoop-common-dev mailing list archives

From "Shevek (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-1230) Replace parameters with context objects in Mapper, Reducer, Partitioner, InputFormat, and OutputFormat classes
Date Fri, 24 Apr 2009 20:56:30 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702547#action_12702547 ]

Shevek edited comment on HADOOP-1230 at 4/24/09 1:56 PM:
---------------------------------------------------------

You might find that unless you pass (Context, Key, Value) as parameters to map(), it is very
hard to implement ChainedMapper, since you would have to delegate an entire Context. It would
also make the things I want to do with Hadoop very hard. Unless I hear a good argument
otherwise, I will submit a new ticket.

If you want to deserialize the values lazily, there are a couple of options:

(a) Choice of methods on the input object:

    RecordInput.getKey() { return deserialize(getKeyBytes()); }
    map(Context, RecordInput) { input.getKey[Bytes]()... }

(b) Choice of methods to override in Mapper:

    Mapper.map(Context, byte[], byte[]) { map(ctx, deserialize(keybytes), deserialize(valuebytes)); }
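To make option (a) concrete, here is a small self-contained sketch of lazy key deserialization. RecordInput and its methods are illustrative stand-ins, not the real Hadoop API, and String stands in for whatever deserialized key type the framework would produce:

```java
import java.nio.charset.StandardCharsets;

// Sketch of option (a): the input object exposes both the raw bytes and a
// lazily deserialized view, so a mapper that only needs the bytes never
// pays the deserialization cost.
class RecordInput {
    private final byte[] keyBytes;
    private String key; // deserialized on first access, then cached

    RecordInput(byte[] keyBytes) {
        this.keyBytes = keyBytes;
    }

    byte[] getKeyBytes() {
        return keyBytes; // no deserialization
    }

    String getKey() {
        if (key == null) {
            // "deserialize" here is just a UTF-8 decode for illustration
            key = new String(keyBytes, StandardCharsets.UTF_8);
        }
        return key;
    }
}

public class LazyDeserializationSketch {
    public static void main(String[] args) {
        RecordInput input = new RecordInput("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(input.getKeyBytes().length); // 5, without deserializing
        System.out.println(input.getKey());             // hello, deserialized once
    }
}
```

A mapper written against such an input object chooses per record whether to call getKeyBytes() or getKey(), which is the crux of option (a).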

      was (Author: arren):
    You might find that unless you pass (Context, Key, Value) as parameters to map(), it is
very hard to implement ChainedMapper, since you will have to delegate an entire Context. It
will also be very hard to do the things I want to do with Hadoop. Unless I hear a good argument
otherwise, I will submit a new ticket.
  
> Replace parameters with context objects in Mapper, Reducer, Partitioner, InputFormat, and OutputFormat classes
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1230
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1230
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.20.0
>
>         Attachments: context-objs-2.patch, context-objs-3.patch, context-objs.patch, h1230.patch, h1230.patch, h1230.patch, h1230.patch, h1230.patch
>
>
> This is a big change, but it will future-proof our API's. To maintain backwards compatibility, I'd suggest that we move over to a new package name (org.apache.hadoop.mapreduce) and deprecate the old interfaces and package. Basically, it will replace:
> package org.apache.hadoop.mapred;
> public interface Mapper extends JobConfigurable, Closeable {
>   void map(WritableComparable key, Writable value, OutputCollector output, Reporter reporter) throws IOException;
> }
> with:
> package org.apache.hadoop.mapreduce;
> public interface Mapper extends Closeable {
>   void map(MapContext context) throws IOException;
> }
> where MapContext has the methods like getKey(), getValue(), collect(Key, Value), progress(), etc.
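As a toy illustration of the context-object style the issue proposes, the sketch below defines simplified stand-in interfaces (String keys and values instead of Writable types, and only the MapContext methods named above); it is not the actual Hadoop 0.20 API:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the proposed context object: one parameter
// carries the record, the output channel, and progress reporting.
interface MapContext {
    String getKey();
    String getValue();
    void collect(String key, String value);
    void progress();
}

// Simplified stand-in for the proposed Mapper interface: a single
// map(MapContext) method instead of four positional parameters.
interface Mapper {
    void map(MapContext context) throws IOException;
}

public class ContextObjectSketch {
    public static void main(String[] args) throws IOException {
        final List<String> out = new ArrayList<>();
        MapContext ctx = new MapContext() {
            public String getKey()   { return "k1"; }
            public String getValue() { return "v1"; }
            public void collect(String k, String v) { out.add(k + "=" + v); }
            public void progress() { /* no-op in this sketch */ }
        };
        // Identity mapper in the new style: everything flows through the context.
        Mapper identity = c -> c.collect(c.getKey(), c.getValue());
        identity.map(ctx);
        System.out.println(out); // [k1=v1]
    }
}
```

The point of the change is visible even in this toy: new accessors can be added to MapContext later without breaking every existing Mapper signature, which is what "future-proof" means in the issue description.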

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

