hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1231) Add generics to Mapper and Reducer interfaces
Date Mon, 16 Jul 2007 21:17:06 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-1231:
------------------------------

    Attachment: MapReduceTypes.html

Due to the problems with erasure mentioned above, I don't think we can generify JobConf. This
means that compile-time type checking of the job configuration is lost. However, Map Reduce
applications are still clearer, as the types are explicit so casts aren't needed, and some
runtime checking will be supported.

There are 6 type parameters: K1, V1, K2, V2, K3, V3, related in the familiar Map Reduce way:

{noformat}
map: (K1, V1) -> list(K2, V2)
reduce: (K2, list(V2)) -> list(K3, V3)
{noformat}
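
To make the six type parameters concrete, here is a minimal sketch of how they could thread through generic interfaces. The names SimpleMapper, SimpleReducer and Collector are hypothetical stand-ins invented for this illustration; they are not the interfaces in the attached patch, and the real signatures may well differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MapReduceTypesSketch {
    /** Hypothetical output sink; not the Hadoop interface. */
    interface Collector<K, V> {
        void collect(K key, V value);
    }

    /** map: (K1, V1) -> list(K2, V2) */
    interface SimpleMapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, Collector<K2, V2> out);
    }

    /** reduce: (K2, list(V2)) -> list(K3, V3) */
    interface SimpleReducer<K2, V2, K3, V3> {
        void reduce(K2 key, List<V2> values, Collector<K3, V3> out);
    }

    // K1 = Long (offset), V1 = String (line), K2 = String (word), V2 = Integer (count)
    static class WordMapper implements SimpleMapper<Long, String, String, Integer> {
        public void map(Long offset, String line, Collector<String, Integer> out) {
            for (String word : line.split("\\s+")) {
                if (!word.isEmpty()) {
                    out.collect(word, 1);
                }
            }
        }
    }

    // K2 = String, V2 = Integer, K3 = String, V3 = Integer
    static class SumReducer implements SimpleReducer<String, Integer, String, Integer> {
        public void reduce(String word, List<Integer> counts, Collector<String, Integer> out) {
            int sum = 0;
            for (int c : counts) {
                sum += c;
            }
            out.collect(word, sum);
        }
    }

    /** Runs one map call and one reduce call, recording what each emits. */
    static List<String> demo() {
        final List<String> emitted = new ArrayList<>();
        new WordMapper().map(0L, "a b a", (k, v) -> emitted.add(k + "=" + v));
        new SumReducer().reduce("a", Arrays.asList(1, 1), (k, v) -> emitted.add(k + "=" + v));
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that no casts appear in either map or reduce, and that the mapper's K2/V2 must line up with the reducer's K2/V2 for the two to be chained.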

I have attached a table which shows which configuration properties are constrained by which
types.

This picture is further complicated by erasure: it is not always possible to infer type
parameters at runtime (so, e.g., we can't infer the key type for LongSumReducer).
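
As an illustration of the erasure problem, the sketch below uses reflection to recover type arguments from a class declaration. It assumes a hypothetical SimpleReducer interface, and GenericSumReducer is only written in the spirit of LongSumReducer (key type left open); neither is the real Hadoop class. The helper succeeds when the class binds a concrete type and returns null when it only finds a type variable.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class ErasureSketch {
    /** Hypothetical four-parameter reducer interface, for illustration only. */
    interface SimpleReducer<K2, V2, K3, V3> {}

    // All four parameters are bound to concrete classes in the declaration.
    static class WordCountReducer implements SimpleReducer<String, Long, String, Long> {}

    // The key type is left as a type variable, in the spirit of LongSumReducer.
    static class GenericSumReducer<K> implements SimpleReducer<K, Long, K, Long> {}

    /**
     * Returns the index-th type argument of SimpleReducer as declared by cls,
     * or null if that argument is not a concrete class (e.g. a type variable).
     */
    static Class<?> typeArgument(Class<?> cls, int index) {
        for (Type t : cls.getGenericInterfaces()) {
            if (t instanceof ParameterizedType) {
                ParameterizedType pt = (ParameterizedType) t;
                if (pt.getRawType() == SimpleReducer.class) {
                    Type arg = pt.getActualTypeArguments()[index];
                    return (arg instanceof Class) ? (Class<?>) arg : null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(typeArgument(WordCountReducer.class, 0));   // concrete key type
        System.out.println(typeArgument(GenericSumReducer.class, 0));  // key type not recoverable
        System.out.println(typeArgument(GenericSumReducer.class, 1));  // value type still recoverable
    }
}
```

An instance created as new GenericSumReducer<String>() has the same runtime class as one created with any other key type, so the String never reaches the framework.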

Because the configuration properties are constrained in complex ways, and because of erasure,
it's hard to devise simple rules that would let users work out how the types in their jobs
would be inferred. So I don't think we should try to infer the types for a job; rather, we
should only check them for consistency (at runtime).
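
A rough sketch of what "check, don't infer" could look like follows. Everything here is an assumption made for illustration: the property labels, the validate signature, and the choice of the map output key as the example constraint are hypothetical and not taken from the attached table. A null declared type stands for "erased, nothing to check".

```java
import java.util.ArrayList;
import java.util.List;

public class ConsistencyCheckSketch {
    /**
     * Records a problem if the configured class is not compatible with the
     * declared one. A null on either side means "unknown" and is skipped,
     * so erased type parameters never cause a failure.
     */
    static void check(String property, Class<?> declared, Class<?> configured,
                      List<String> problems) {
        if (declared == null || configured == null) {
            return;
        }
        if (!declared.isAssignableFrom(configured)) {
            problems.add(property + ": configured " + configured.getName()
                    + " is not a " + declared.getName());
        }
    }

    /**
     * Hypothetical check of one constraint: the configured map output key
     * class must agree with K2 as seen by both the mapper and the reducer.
     */
    static List<String> validate(Class<?> mapperK2, Class<?> reducerK2,
                                 Class<?> configuredMapOutputKey) {
        List<String> problems = new ArrayList<>();
        check("map output key (mapper K2)", mapperK2, configuredMapOutputKey, problems);
        check("map output key (reducer K2)", reducerK2, configuredMapOutputKey, problems);
        return problems;
    }

    public static void main(String[] args) {
        // Reducer key type erased (null): only the mapper side is checked.
        System.out.println(validate(String.class, null, Long.class));
    }
}
```

The point of the design is that the check only ever rejects a job when both sides are known and disagree; it never guesses a missing type.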

Furthermore, I propose doing this consistency checking as a separate Jira, leaving this one
to deal with generifying the Map Reduce public API (which in itself is quite a big change).

Thoughts? 

> Add generics to Mapper and Reducer interfaces
> ---------------------------------------------
>
>                 Key: HADOOP-1231
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1231
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Tom White
>         Attachments: HADOOP-1231.patch, MapReduceTypes.html
>
>
> By making the input and output types of the Mapper and Reducers generic, we can get the
> information from the classes and not require the user to set them in the configuration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

