hadoop-common-user mailing list archives

From Matthew John <tmatthewjohn1...@gmail.com>
Subject Re: Could I write outputs in multiple directories?
Date Mon, 14 Feb 2011 07:38:41 GMT
Hi Junyoung Kim,

You can try MultipleOutputs.addNamedOutput(). The second parameter you
pass in is supposed to be the name of the file to which the reducer
output is written. So if your output folder is X (set using
setOutputPath()), you can try passing "A/output", "B/output", and
"C/output" as the second parameter. It should then write the
corresponding data to X/A/output, X/B/output, and X/C/output
respectively, I guess.
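
A minimal plain-Java sketch of the directory layout described above. It only shows how the relative names would resolve under the output folder X; it does not call Hadoop and does not confirm that addNamedOutput() accepts names containing a slash. As far as I know, org.apache.hadoop.mapred.lib.MultipleOutputs restricts named outputs to letters and digits, so treat the "A/output" form as something to verify. The class and method names (OutputLayout, resolve, layout) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: illustrates the intended layout only, no Hadoop calls.
final class OutputLayout {
    private OutputLayout() {}

    // Joins the job output folder and a relative name with '/', the HDFS separator.
    static String resolve(String outputFolder, String relativeName) {
        String base = outputFolder.endsWith("/")
                ? outputFolder.substring(0, outputFolder.length() - 1)
                : outputFolder;
        return base + "/" + relativeName;
    }

    // Maps each key type to the file the reply expects it to end up in.
    static Map<String, String> layout(String outputFolder, String... keyTypes) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String type : keyTypes) {
            result.put(type, resolve(outputFolder, type + "/output"));
        }
        return result;
    }
}
```

Calling OutputLayout.layout("X", "A", "B", "C") yields the three targets named in the reply, keyed by type.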

In the reducer, you can use getCollector() to write each record to a
different output path depending on its key.
For example:
if (Key == A)
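
A sketch of what such per-key routing might look like, written as a plain-Java stand-in rather than a real reducer. Two assumptions are made here that are not in the message: the key type is taken to be the key's leading character (for example "A1"), and an in-memory map stands in for the collectors Hadoop would hand out. KeyRouter and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for routing reducer output by key; no Hadoop classes used.
final class KeyRouter {
    // Records gathered per named output; stands in for Hadoop's collectors.
    private final Map<String, List<String>> collected = new HashMap<>();

    // Assumption: the key type is the key's leading character.
    static String namedOutputFor(String key) {
        // Compare strings with startsWith/equals; == would compare references.
        if (key.startsWith("A")) return "A";
        if (key.startsWith("B")) return "B";
        if (key.startsWith("C")) return "C";
        throw new IllegalArgumentException("Unknown key type: " + key);
    }

    // Sends every value for this key to the output chosen from the key.
    void reduce(String key, List<String> values) {
        String name = namedOutputFor(key);
        // In a real old-API reducer, this is roughly where
        // getCollector(name, reporter).collect(key, value) would be called.
        List<String> sink = collected.computeIfAbsent(name, n -> new ArrayList<>());
        for (String value : values) {
            sink.add(key + "\t" + value);
        }
    }

    // Returns the records routed to one named output (empty if none).
    List<String> recordsFor(String name) {
        return collected.getOrDefault(name, new ArrayList<>());
    }
}
```

Keeping the key-to-output decision in one small function makes it easy to test on its own before wiring it into a job.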


On Mon, Feb 14, 2011 at 11:27 AM, Jun Young Kim <juneng603@gmail.com> wrote:
> Hi,
> As I understand it, Hadoop can write multiple files into one directory,
> but it can't write output files into multiple directories. Is that right?
> MultipleOutputs is for generating multiple files.
> FileInputFormat.addInputPaths is for setting several input paths at once.
> What should I do if I want to write output files into multiple directories depending on the key type?
> For example:
> A type key -> yyyyMMdd/A/output
> B type key -> yyyyMMdd/B/output
> C type key -> yyyyMMdd/C/output
> Thanks.
> --
> Junyoung Kim (juneng603@gmail.com)
