hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-115) Hadoop should allow the user to use SequentialFileOutputformat as the output format and to choose key/value classes that are different from those for map output.
Date Fri, 31 Mar 2006 15:31:40 GMT
Hadoop should allow the user to use SequentialFileOutputformat as the output format and to
choose  key/value classes that are different from those for map output. 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------

         Key: HADOOP-115
         URL: http://issues.apache.org/jira/browse/HADOOP-115
     Project: Hadoop
        Type: Improvement
  Components: mapred  
    Reporter: Runping Qi



When map tasks write intermediate data out, they always use the SequenceFile RecordWriter with
the key/value classes from the job object.

When the reducers write the final results out, their output format is obtained from the job
object. By default it is TextOutputFormat, so there is no conflict.
However, if one wants to use SequenceFileOutputFormat for the final results, the key/value
classes are also obtained from the job object, the same ones used for the map tasks' output. This
is the problem: the map outputs and reducer outputs cannot use different key/value
classes if one wants the reducers to generate output in SequenceFile format.

A simple fix would be to add two more attributes to the JobConf class: mapOutputKeyClass and
mapOutputValueClass. That would allow the user to have different key/value classes for the
intermediate and final outputs.
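
The proposed fix could look roughly like the sketch below. It is not the actual JobConf code:
SketchJobConf, its property names, and the fallback of the map output classes to the final
output classes when unset (so existing jobs behave as before) are all assumptions made for
illustration. Plain JDK classes stand in for the key/value types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JobConf; illustrates the two proposed attributes only.
class SketchJobConf {
    // Property names below are made up for this sketch.
    private final Map<String, Class<?>> classes = new HashMap<>();

    // Key/value classes for the final (reducer) output.
    void setOutputKeyClass(Class<?> c)   { classes.put("output.key", c); }
    void setOutputValueClass(Class<?> c) { classes.put("output.value", c); }

    // Proposed additions: key/value classes for the intermediate (map) output.
    void setMapOutputKeyClass(Class<?> c)   { classes.put("map.output.key", c); }
    void setMapOutputValueClass(Class<?> c) { classes.put("map.output.value", c); }

    Class<?> getOutputKeyClass()   { return classes.getOrDefault("output.key", Object.class); }
    Class<?> getOutputValueClass() { return classes.getOrDefault("output.value", Object.class); }

    // Assumed behavior: fall back to the final output classes when the
    // map output classes were never set.
    Class<?> getMapOutputKeyClass() {
        Class<?> c = classes.get("map.output.key");
        return c != null ? c : getOutputKeyClass();
    }

    Class<?> getMapOutputValueClass() {
        Class<?> c = classes.get("map.output.value");
        return c != null ? c : getOutputValueClass();
    }
}
```

With something like this, the map-side writer would ask for getMapOutputKeyClass() and
getMapOutputValueClass(), while the reducer's output format would keep using getOutputKeyClass()
and getOutputValueClass(), so the two could differ.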



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

