hadoop-mapreduce-issues mailing list archives

From Sheetal Gosrani <sgosr...@barracuda.com>
Subject MultipleOutputs writes to only 1 output file even if configured to write to 2 output files
Date Thu, 18 Oct 2012 21:52:10 GMT

I am trying to read from Cassandra and write the reducer's output to multiple output files
using the MultipleOutputs API. The file formats in my case are custom output formats extending
FileOutputFormat. I have configured my job in a manner similar to the one shown in the
MultipleOutputs javadoc: http://hadoop.apache.org/docs/r1.0.3/api/index.html

However, when I run the job, I get only one output file, named part-r-0000, which is in text
output format. (If job.setOutputFormatClass is not set, TextOutputFormat is used by default.)
The job completely ignores the output formats I specified in MultipleOutputs.addNamedOutput(job,
"format1", MyCustomFileFormat1.class, Text.class, Text.class) and MultipleOutputs.addNamedOutput(job,
"format2", MyCustomFileFormat2.class, Text.class, Text.class). Is anyone else facing a similar
problem, or am I doing something wrong?

I also tried writing a very simple MR program that reads from a text file and writes the
output in two formats, TextOutputFormat and SequenceFileOutputFormat, as shown in the
MultipleOutputs javadoc. No luck there either: I get only one output file, in text output format.
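
For reference, a minimal sketch of the two-format test job described above, assuming the new
org.apache.hadoop.mapreduce API and the Hadoop libraries on the classpath. The class names
TwoFormatJob and TwoFormatReducer and the named outputs "text" and "seq" are hypothetical and
not taken from the original job. The sketch relies on the default identity mapper, so the
keys are the LongWritable line offsets produced by the default TextInputFormat.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical driver and reducer; names are illustrative only.
public class TwoFormatJob {

    public static class TwoFormatReducer
            extends Reducer<LongWritable, Text, LongWritable, Text> {

        private MultipleOutputs<LongWritable, Text> mos;

        @Override
        protected void setup(Context context) {
            // One MultipleOutputs instance per reduce task.
            mos = new MultipleOutputs<LongWritable, Text>(context);
        }

        @Override
        protected void reduce(LongWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            for (Text value : values) {
                // Records reach a named output only through mos.write();
                // context.write() sends them to the job's default output format.
                mos.write("text", key, value);
                mos.write("seq", key, value);
            }
        }

        @Override
        protected void cleanup(Context context)
                throws IOException, InterruptedException {
            // Closing flushes the record writers of the named outputs.
            mos.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0] = input path, args[1] = output directory (assumed layout).
        Job job = new Job(new Configuration(), "two-format-output");
        job.setJarByClass(TwoFormatJob.class);
        job.setReducerClass(TwoFormatReducer.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Register one named output per format.
        MultipleOutputs.addNamedOutput(job, "text",
                TextOutputFormat.class, LongWritable.class, Text.class);
        MultipleOutputs.addNamedOutput(job, "seq",
                SequenceFileOutputFormat.class, LongWritable.class, Text.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With this pattern, the javadoc indicates that records written through mos.write() go to files
named after the named output (for example text-r-00000), while the default part-r file holds
only what the reducer writes through context.write().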

Can someone help me with this?


