hadoop-common-user mailing list archives

From Mehul Sutariya <mehulgsutar...@gmail.com>
Subject Writing different output types from map.
Date Thu, 10 Dec 2009 09:07:05 GMT
Hello everyone,

I am trying to write different output types from a mapper. For example, say I
have 3 classes A, B and C such that:

- C extends A and
- B extends A

and my mapper and reducer are declared as:
Mapper<K1, V1, K2, A>
Reducer<K2, A, K3, V3>

Now, in my map function, when I try to write a record whose value is an
instance of B or C, the Hadoop framework throws an exception because it
does some sort of type checking on the output value. I understand the
problem with doing this: when the reducer reads the records that the map
stage wrote, using the readFields() method, it has no way to know which
specialized type each instance is, and hence I get a Type mismatch error.

Has anyone felt the need to do this, or is there a workaround for this
type of operation? I would like to know of possible alternatives, if any.
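
One alternative that might fit here is a wrapper value that writes a small
type tag ahead of the payload, so that readFields() can tell which subclass
to instantiate. Hadoop's org.apache.hadoop.io.GenericWritable is built around
the same idea. The sketch below is plain Java with no Hadoop dependency:
TaggedValue, the tag constants, the fields on B and C, and roundTrip() are
hypothetical names made up for illustration, and A here only imitates the
write()/readFields() contract rather than implementing Hadoop's Writable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class TaggedValueDemo {

    // Stand-in for the base value type A (assumption: mimics write/readFields).
    static abstract class A {
        abstract void write(DataOutput out) throws IOException;
        abstract void readFields(DataInput in) throws IOException;
    }

    // Hypothetical payload for B: a single int.
    static class B extends A {
        int count;
        B() {}
        B(int count) { this.count = count; }
        void write(DataOutput out) throws IOException { out.writeInt(count); }
        void readFields(DataInput in) throws IOException { count = in.readInt(); }
    }

    // Hypothetical payload for C: a single string.
    static class C extends A {
        String label = "";
        C() {}
        C(String label) { this.label = label; }
        void write(DataOutput out) throws IOException { out.writeUTF(label); }
        void readFields(DataInput in) throws IOException { label = in.readUTF(); }
    }

    // The one concrete type that would be declared as the map output value.
    static class TaggedValue {
        private static final byte TAG_B = 0;
        private static final byte TAG_C = 1;
        private A instance;

        TaggedValue() {}
        TaggedValue(A instance) { this.instance = instance; }
        A get() { return instance; }

        // Write the type tag first, then let the wrapped object serialize itself.
        void write(DataOutput out) throws IOException {
            if (instance instanceof B) {
                out.writeByte(TAG_B);
            } else if (instance instanceof C) {
                out.writeByte(TAG_C);
            } else {
                throw new IOException("Unsupported value type: " + instance);
            }
            instance.write(out);
        }

        // Read the tag, create the matching subclass, then read its fields.
        void readFields(DataInput in) throws IOException {
            byte tag = in.readByte();
            switch (tag) {
                case TAG_B: instance = new B(); break;
                case TAG_C: instance = new C(); break;
                default: throw new IOException("Unknown type tag: " + tag);
            }
            instance.readFields(in);
        }
    }

    // Serialize and deserialize through a byte buffer, as a map/reduce hop would.
    static A roundTrip(A value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new TaggedValue(value).write(new DataOutputStream(bytes));
            TaggedValue copy = new TaggedValue();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        A b = roundTrip(new B(7));
        A c = roundTrip(new C("hello"));
        System.out.println(b.getClass().getSimpleName() + " " + ((B) b).count);   // prints "B 7"
        System.out.println(c.getClass().getSimpleName() + " " + ((C) c).label);   // prints "C hello"
    }
}
```

With a wrapper like this, the job would declare the wrapper as the map output
value type, i.e. Mapper<K1, V1, K2, TaggedValue> and
Reducer<K2, TaggedValue, K3, V3>, and the reducer would call get() and branch
on the concrete type. The cost is one extra byte per record.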

Right now, as a workaround, I am converting my records to text and writing
them as the output of the map phase, then parsing each record again in the
reduce phase to generate the objects and then finally write the appropriate
