hadoop-common-dev mailing list archives

From Owen O'Malley <o...@yahoo-inc.com>
Subject Re: Java 1.5?
Date Mon, 09 Apr 2007 18:11:58 GMT

On Apr 8, 2007, at 1:48 AM, Tom White wrote:

> I think we can do a lot to improve the use of generics, particularly
> in MapReduce.
> <... use generics in interfaces ...>

I like it. I was thrown off at first because Java classes aren't
specialized based on their type parameters, but the parameterized
type of the parent class or interface is available.
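
As a rough illustration of that last point: type arguments are erased from instances, but the arguments a class names in its own declaration can still be read back through reflection. The sketch below assumes a generified Mapper interface with four type parameters. Mapper, WordCountMapper, and typeArguments are hypothetical names made up for the example, not the current Hadoop API.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

final class GenericIntrospection {
    /** Hypothetical stand-in for a generified Mapper interface. */
    interface Mapper<K1, V1, K2, V2> {}

    /** A user class that binds all four type parameters in its declaration. */
    static final class WordCountMapper
            implements Mapper<Long, String, String, Integer> {}

    /**
     * Returns the type arguments that impl declared for iface, or null if
     * impl implements iface as a raw type or does not implement it directly.
     */
    static Type[] typeArguments(Class<?> impl, Class<?> iface) {
        for (Type t : impl.getGenericInterfaces()) {
            if (t instanceof ParameterizedType) {
                ParameterizedType pt = (ParameterizedType) t;
                if (pt.getRawType() == iface) {
                    return pt.getActualTypeArguments();
                }
            }
        }
        return null;
    }
}
```

Calling typeArguments(WordCountMapper.class, Mapper.class) yields the four declared types in order, so index 2 is the map output key. The sketch only looks at directly implemented interfaces; a fuller version would also have to walk superclasses.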

> Reducer would be changed similarly, although I'm not sure how we could
> constrain the output types of the Mapper to be the input types of the
> Reducer. Perhaps via the JobConf?

That is easy, actually. In the JobClient, we'd just check to see if  
the types all play well together. Basically, you need:
  K1, V1 -> map      -> K2, V2
  K2, V2 -> combiner -> K2, V2 (if used)
  K2, V2 -> reduce   -> K3, V3

It will be a tricky bit of specification to decide exactly what the
right semantics are, since even with generics the application isn't
required to declare the types. Therefore, there are five places where
we could find a value for K2 (config, mapper output, combiner input,
combiner output, or reduce input). Clearly all classes must be
checked for consistency once Hadoop decides what the right values are
for each type.
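
A minimal sketch of what such a check might look like for K2, assuming generified Mapper and Reducer interfaces and a combiner that is just a Reducer. Every name here (JobTypeCheck, candidatesForK2, isConsistent, the sample classes) is hypothetical, and which source should win when they disagree is exactly the open semantics question, so the sketch only reports whether the declared candidates agree.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;

final class JobTypeCheck {
    // Hypothetical generified interfaces, not the current Hadoop API.
    interface Mapper<K1, V1, K2, V2> {}
    interface Reducer<K2, V2, K3, V3> {}

    // Sample user classes for the example.
    static final class TokenMapper implements Mapper<Long, String, String, Integer> {}
    static final class SumReducer implements Reducer<String, Integer, String, Integer> {}
    static final class BadReducer implements Reducer<Long, Integer, String, Integer> {}

    /** Declared type arguments of impl for iface, or null if none were declared. */
    static Type[] typeArguments(Class<?> impl, Class<?> iface) {
        for (Type t : impl.getGenericInterfaces()) {
            if (t instanceof ParameterizedType
                    && ((ParameterizedType) t).getRawType() == iface) {
                return ((ParameterizedType) t).getActualTypeArguments();
            }
        }
        return null;
    }

    /**
     * Gathers K2 from each of the five places it may be declared. Sources
     * that declare nothing (null config, no combiner, raw types) are skipped.
     */
    static Map<String, Type> candidatesForK2(Class<?> configuredKey, Class<?> mapper,
                                             Class<?> combiner, Class<?> reducer) {
        Map<String, Type> found = new LinkedHashMap<String, Type>();
        if (configuredKey != null) {
            found.put("config", configuredKey);
        }
        Type[] m = typeArguments(mapper, Mapper.class);
        if (m != null) {
            found.put("mapper output", m[2]);
        }
        if (combiner != null) {
            Type[] c = typeArguments(combiner, Reducer.class);
            if (c != null) {
                found.put("combiner input", c[0]);
                found.put("combiner output", c[2]);
            }
        }
        Type[] r = typeArguments(reducer, Reducer.class);
        if (r != null) {
            found.put("reduce input", r[0]);
        }
        return found;
    }

    /** True when every source that declared K2 declared the same type. */
    static boolean isConsistent(Map<String, Type> candidates) {
        return new HashSet<Type>(candidates.values()).size() <= 1;
    }
}
```

With TokenMapper and SumReducer (also used as the combiner) all four declared candidates are String and the check passes; swapping in BadReducer makes the reduce input Long and the check fails. The same pattern would apply to V2, and the map returned by candidatesForK2 keeps the source names so an error message could say which declarations conflict.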

The other piece that this interacts with is the desire to use context  
objects in the parameter list. However, they appear to be orthogonal  
to each other.

-- Owen
