hadoop-mapreduce-user mailing list archives

From David Rosenstrauch <dar...@darose.net>
Subject Re: How to set SequenceFile.Metadata from within SequenceFileOutputFormat?
Date Tue, 10 Aug 2010 02:52:38 GMT
On 08/09/2010 09:14 PM, Harsh J wrote:
> Another solution would be to create a custom named output using
> mapred.lib.MultipleOutputs and collecting to that instead of the
> job-set output format (which one can set to NullOutputFormat so it
> doesn't complain about existing paths, etc.).
> So if you'd want 'foo' prefix to your 00000-NNNNN numbered output
> files (instead of default 'part'), you'd create it with
> MultipleOutputs.addNamedOutput(Conf, "foo", YourOutFormat.class,
> Key.class, Value.class);
> The extension, I believe, can be changed too, while 'getting' the path
> from the FileOutputFormat while building your RecordWriter. Something
> like:
> Path outPath = FileOutputFormat.getTaskOutputPath(job, name+YOUR_EXTENSION);
> // Now create the 'writer' on this path.

Thanks for the tip; I didn't know about MultipleOutputs. (Though it's
probably overkill for what I'm doing.)

Thanks again,

