hive-user mailing list archives

From "Xu, Cheng A" <cheng.a...@intel.com>
Subject RE: Possible to create ORC file with specific name in reducer?
Date Wed, 29 Oct 2014 01:21:16 GMT
You can refer to the unit test TestNewInputOutputFormat, which sets the job's output directory with:

    FileOutputFormat.setOutputPath(job, outputPath);
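
Note that setOutputPath only fixes the output directory; the file name inside it is generated per task. The sketch below is plain Java, not Hadoop code, and only imitates the naming pattern that the stock FileOutputFormat is assumed to use here (base name, task type, zero-padded partition, extension). The class and method names are hypothetical, and whether OrcNewOutputFormat follows exactly this pattern in your Hive version is an assumption worth checking against its source.

```java
import java.util.Locale;

public class OutputNameSketch {
    // Imitates the assumed pattern <base>-<taskType>-<partition><extension>.
    // Hypothetical helper for illustration; not part of Hadoop or Hive.
    static String uniqueFile(String baseName, char taskType, int partition, String extension) {
        // Partition number is zero-padded to at least five digits.
        String padded = String.format(Locale.ROOT, "%05d", partition);
        return baseName + "-" + taskType + "-" + padded + extension;
    }

    public static void main(String[] args) {
        // Default base name "part", reducer task 0, no extension.
        System.out.println(uniqueFile("part", 'r', 0, ""));       // part-r-00000
        // A custom base name and extension, reducer task 3.
        System.out.println(uniqueFile("myorc", 'r', 3, ".orc"));  // myorc-r-00003.orc
    }
}
```

If this assumption holds, the directory comes from setOutputPath and only the base name is open to configuration (in Hadoop 2, reportedly through the mapreduce.output.basename property); the task-type and partition suffix would still be appended.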

From: Robert Towne [mailto:Robert.Towne@WebTrends.com]
Sent: Wednesday, October 29, 2014 6:35 AM
To: user@hive.apache.org
Subject: Possible to create ORC file with specific name in reducer?

I want to write to a new ORC file from a map/reduce job outside of Hive.

I'd like to set up the job without specifying the output format like this:
job.setOutputFormatClass(OrcNewOutputFormat.class);

But instead specify:
job.setOutputFormatClass(NullOutputFormat.class);

And in the reducer, specify the filename, iterate through the results, and close the file at the end.

Something akin to:

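    @Override
    public void reduce(final Text key, final Iterable<MyClass> myClasses, final Context context)
            throws IOException, InterruptedException {
        // Desired: a writer bound to a file name chosen here (not possible as written).
        RecordWriter orcWriter = new OrcNewOutputFormat().getRecordWriter(context);
        for (MyClass myClass : myClasses) {
            // Write each value; assumes myClass is already in the row form the ORC writer expects.
            orcWriter.write(NullWritable.get(), myClass);
        }
        orcWriter.close(context);
    }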

But there is no public constructor for OrcNewOutputFormat to specify the filename I'd like.

Am I missing something, or does anyone know how to specify the filename?

Thank you,
Robert Towne