hadoop-common-user mailing list archives

From Kris Jirapinyo <kris.jirapi...@biz360.com>
Subject Re: Two output files?
Date Fri, 14 Aug 2009 20:23:52 GMT
Hi John,
     If you have the O'Reilly Hadoop book, look at page 206 for an example.
But basically, you just create a subclass of MultipleTextOutputFormat and
inside it override generateFileNameForKeyValue (for example) to return the
desired filename for each key/value pair.  The reducer itself does not
change: it still calls output.collect(key, val), and each pair is written
to the file named by your override.  Make sure that in the JobConf you set
the OutputFormat to your class that extends MultipleTextOutputFormat.
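
A minimal sketch of that approach, assuming Text keys and values and the
old org.apache.hadoop.mapred API. The class name KeyBasedOutputFormat, the
helper fileNameFor, the "first"/"second" prefixes, and the rule of splitting
on keys that start with "a" are all made up for illustration; substitute
whatever test on the key fits your data.

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

// Hypothetical output format that routes each record to one of two files
// based on its key.
public class KeyBasedOutputFormat extends MultipleTextOutputFormat<Text, Text> {

    // Illustrative routing rule (an assumption, not from the original thread):
    // keys starting with "a" go to the "first" file, everything else to "second".
    // leafName is the default per-task file name, e.g. "part-00000"; keeping it
    // in the result avoids name clashes when several reducers run.
    public static String fileNameFor(String key, String leafName) {
        String prefix = key.startsWith("a") ? "first" : "second";
        return prefix + "-" + leafName;
    }

    // Called by the framework for every key/value pair the reducer collects.
    // The returned name is relative to the job's output directory.
    @Override
    protected String generateFileNameForKeyValue(Text key, Text value, String name) {
        return fileNameFor(key.toString(), name);
    }

    // Job setup: register this class as the job's output format.
    public static void configure(JobConf conf) {
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        conf.setOutputFormat(KeyBasedOutputFormat.class);
    }
}
```

With a single reducer this should leave first-part-00000 and
second-part-00000 in the output directory. With several reducers, each one
writes its own pair of files.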

-- Kris.

On Fri, Aug 14, 2009 at 7:11 AM, John Clarke <clarkemjj@gmail.com> wrote:

> Hi,
> I want to output two text files from my MapReduce job but I am having
> trouble understanding how to use the MultipleTextOutputFormat class to do
> so.
> I want to write to the two files depending on the key of each key/value
> pair.
> In the Reducer how do I tell it to write the different files? Normally I
> just do an output.collect(key, val);.
> Any help would be most appreciated.
> Thanks,
> John
