hadoop-common-dev mailing list archives

From "Runping Qi" <runp...@yahoo-inc.com>
Subject RE: Different Key/Value classes for Map and Reduce?
Date Fri, 31 Mar 2006 15:21:34 GMT

A simple fix is to add two more attributes to the JobConf class:
mapOutputKeyClass and mapOutputValueClass. That allows the user to have
different key/value classes for the intermediate and final outputs.
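
A minimal sketch of what the two extra attributes could look like. SketchJobConf
is a hypothetical stand-in, not the real JobConf, and the fallback to the final
output classes when the map output classes are unset is an assumption made so
that existing jobs keep working unchanged:

```java
// Hypothetical stand-in for JobConf; it only models the four class attributes.
public class SketchJobConf {
    private Class<?> outputKeyClass = Object.class;
    private Class<?> outputValueClass = Object.class;
    // null means "not set": fall back to the final output class (assumption).
    private Class<?> mapOutputKeyClass;
    private Class<?> mapOutputValueClass;

    public void setOutputKeyClass(Class<?> c) { outputKeyClass = c; }
    public void setOutputValueClass(Class<?> c) { outputValueClass = c; }
    public void setMapOutputKeyClass(Class<?> c) { mapOutputKeyClass = c; }
    public void setMapOutputValueClass(Class<?> c) { mapOutputValueClass = c; }

    public Class<?> getOutputKeyClass() { return outputKeyClass; }
    public Class<?> getOutputValueClass() { return outputValueClass; }

    public Class<?> getMapOutputKeyClass() {
        return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
    }

    public Class<?> getMapOutputValueClass() {
        return mapOutputValueClass != null ? mapOutputValueClass : outputValueClass;
    }

    public static void main(String[] args) {
        SketchJobConf job = new SketchJobConf();
        job.setOutputKeyClass(String.class);       // final (reduce) output key
        job.setOutputValueClass(Long.class);       // final (reduce) output value
        job.setMapOutputValueClass(Integer.class); // intermediate value differs

        // The map output key was never set, so it falls back to the final key class.
        System.out.println(job.getMapOutputKeyClass().getSimpleName());   // String
        System.out.println(job.getMapOutputValueClass().getSimpleName()); // Integer
        System.out.println(job.getOutputValueClass().getSimpleName());    // Long
    }
}
```

With this shape, the map side would read the getMapOutput* getters and the
output format would keep reading getOutput*, so the two no longer conflict.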

I'll file a bug for this problem.


-----Original Message-----
From: Darek Zbik [mailto:d.zbik@softwaremind.pl] 
Sent: Friday, March 31, 2006 4:28 AM
To: hadoop-dev@lucene.apache.org
Subject: Re: Different Key/Value classes for Map and Reduce?

Runping Qi wrote:

>When the reducers write the final results out, the output format is taken
>from the job object. By default, it is TextOutputFormat, and there are no
>conflicts. However, if one wants to use SequenceFileOutputFormat for the
>final results, then the key/value classes are also obtained from the job
>object, the same as for the map tasks' output. Now we have a problem: it is
>impossible for the map outputs and reducer outputs to use different
>key/value classes if one wants the reducers to generate outputs in
>SequenceFileOutputFormat.

I have this problem in a real situation. I solved it by creating my own output
format, which is in fact a copy-paste of SequenceFileOutputFormat with small
changes (it simply takes the output classes from another, my own, job
property). I think that each Hadoop job should have the possibility to specify
the output key/value classes of the reduce task (e.g.
{set,get}ReducerOutput{Key,Value}).
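
The workaround above can be sketched roughly as follows. The property names
are hypothetical, and java.util.Properties stands in for the job object, so
this only illustrates the lookup, not a real output format:

```java
import java.util.Properties;

// Sketch of the lookup a custom output format could do: read the reducer
// output classes from separate job properties instead of the shared ones.
public class ReducerOutputClassLookup {
    // Hypothetical property names, chosen for illustration only.
    public static final String KEY_PROP = "sketch.reducer.output.key.class";
    public static final String VALUE_PROP = "sketch.reducer.output.value.class";

    // Returns the class named by the property, or the fallback if it is unset.
    public static Class<?> resolve(Properties job, String prop, Class<?> fallback) {
        String name = job.getProperty(prop);
        if (name == null) {
            return fallback;
        }
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown class in " + prop + ": " + name, e);
        }
    }

    public static void main(String[] args) {
        Properties job = new Properties();
        job.setProperty(KEY_PROP, "java.lang.String");

        // Key class comes from the custom property; value class uses the fallback.
        Class<?> keyClass = resolve(job, KEY_PROP, Object.class);
        Class<?> valueClass = resolve(job, VALUE_PROP, Long.class);
        System.out.println(keyClass.getName());   // java.lang.String
        System.out.println(valueClass.getName()); // java.lang.Long
    }
}
```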

