hadoop-mapreduce-user mailing list archives

From Hari Sreekumar <hsreeku...@clickable.com>
Subject Re: Using MultipleTextOutputFormat for map-only jobs
Date Thu, 14 Apr 2011 13:07:57 GMT
That is exactly what I do when I have a reduce phase, and it works. But for
map-only jobs, it doesn't work. I'll try overriding the
getInputFileBasedOutputFileName() method.

On Thu, Apr 14, 2011 at 5:19 PM, Harsh J <harsh@cloudera.com> wrote:

> Hello again Hari,
>
> On Thu, Apr 14, 2011 at 5:10 PM, Hari Sreekumar
> <hsreekumar@clickable.com> wrote:
> > Here is a part of the code I am using:
> >     jobConf.setOutputFormat(MultipleTextOutputFormat.class);
>
> You need to subclass the OutputFormat and override its file-naming
> method; otherwise the base class's default name is always used (thus,
> 'part'). You can see a good, complete example at [1].
>
> I'd still recommend using MultipleOutputs for better portability.
> Its javadocs explain how to use it well enough [2].
>
> [1] -
> https://sites.google.com/site/hadoopandhive/home/how-to-write-output-to-multiple-named-files-in-hadoop-using-multipletextoutputformat
> [2] -
> http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html
>
> --
> Harsh J
>
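
For reference, here is a minimal sketch of the subclassing approach described in the quoted message. It assumes the old org.apache.hadoop.mapred API and records keyed by Text. The names KeyBasedNaming, fileNameFor and KeyBasedOutputFormat are hypothetical and do not come from the thread. The naming rule is kept in plain Java so it can be tried without a cluster; the Hadoop wiring is shown in comments because it needs the Hadoop jars on the classpath.

```java
// Hypothetical helper: the naming rule that a MultipleTextOutputFormat
// subclass could apply to each record. Plain Java, no Hadoop dependency.
public final class KeyBasedNaming {
    private KeyBasedNaming() {}

    /**
     * Builds a relative output path from the record key and the default
     * leaf name that Hadoop passes in (for example "part-00000").
     * Characters outside [A-Za-z0-9_-] in the key are replaced with '_'
     * so the key cannot introduce extra path components.
     */
    public static String fileNameFor(String key, String leafName) {
        if (key == null || key.isEmpty()) {
            return leafName; // fall back to the default name
        }
        String safeKey = key.replaceAll("[^A-Za-z0-9_-]", "_");
        return safeKey + "/" + leafName;
    }
}

// Wiring sketch (assumption: old mapred API, Text keys and values):
//
// public class KeyBasedOutputFormat
//         extends MultipleTextOutputFormat<Text, Text> {
//     @Override
//     protected String generateFileNameForKeyValue(
//             Text key, Text value, String name) {
//         return KeyBasedNaming.fileNameFor(key.toString(), name);
//     }
// }
//
// jobConf.setOutputFormat(KeyBasedOutputFormat.class);
// jobConf.setNumReduceTasks(0); // map-only job
```

Keeping the leaf name as the last path component means each task still writes its own file under the per-key directory, so tasks that see the same key should not collide.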
