avro-user mailing list archives

From Aleksey Maslov <Aleksey.Mas...@Lab49.com>
Subject How to direct Reducer to write avro objects to avro sequence file?
Date Fri, 11 Mar 2011 05:54:37 GMT
Hi,
(using Hadoop 0.20.2 and Avro 1.4.1)

I have defined a simple Avro object, 'AvroObj' (a record of strings),
compiled the schema, and set up a simple MR job whose mapper takes
<Object, Text> as input and emits <Text, IntWritable>, and whose reducer
takes said <Text, IntWritable> and ...
What I would like to achieve is to have the reducer emit
<NullWritable, AvroObj> pairs into an Avro sequence file,
so that the next MR job can open that Avro file and read Avro objects,
not text lines, out of it.

I have looked through the (H ed.2) book and a few online samples but can't
figure out how to do it.
Some online sources mention job config settings like:
        job.setOutputFormatClass(AvroOutputFormat.class);        
        AvroOutputFormat.setCompressOutput(conf, false);

But this doesn't compile: setCompressOutput asks for the deprecated JobConf
object, and setOutputFormatClass gives an error about its parameter
(AvroOutputFormat.class is not an applicable argument).
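
One possible reading of both errors, offered as an assumption rather than a confirmed diagnosis: Hadoop 0.20.2 ships two API families, the older org.apache.hadoop.mapred (built around JobConf) and the newer org.apache.hadoop.mapreduce (built around Job), and Avro 1.4.1's AvroOutputFormat appears to be written against the older one. The sketch below uses stand-in classes only (ApiFamilySketch, OldOutputFormat, NewOutputFormat, NewJob and the two format classes are all hypothetical, not real Hadoop or Avro types) to show how a bounded Class parameter rejects a class from the other family:

```java
// Stand-in types only: none of these are the real Hadoop or Avro classes.
public class ApiFamilySketch {

    // Stand-in for the old-API output format family (org.apache.hadoop.mapred).
    public interface OldOutputFormat {}

    // Stand-in for the new-API output format family (org.apache.hadoop.mapreduce).
    public static abstract class NewOutputFormat {}

    // Stand-in for a format written against the old API
    // (assumption: this is the situation of AvroOutputFormat in Avro 1.4.1).
    public static class OldStyleAvroOutputFormat implements OldOutputFormat {}

    // Stand-in for a format written against the new API.
    public static class NewStyleTextOutputFormat extends NewOutputFormat {}

    // Stand-in for the new-API Job. The bounded wildcard on the parameter
    // is what makes the compiler reject classes from the old family.
    public static class NewJob {
        private Class<? extends NewOutputFormat> outputFormat;

        public void setOutputFormatClass(Class<? extends NewOutputFormat> cls) {
            this.outputFormat = cls;
        }

        public Class<? extends NewOutputFormat> getOutputFormatClass() {
            return outputFormat;
        }
    }

    // Runtime equivalent of the compile-time bound check.
    public static boolean isNewApiFormat(Class<?> cls) {
        return NewOutputFormat.class.isAssignableFrom(cls);
    }

    public static void main(String[] args) {
        NewJob job = new NewJob();
        job.setOutputFormatClass(NewStyleTextOutputFormat.class); // compiles

        // The next line would NOT compile, which mirrors the error above:
        // job.setOutputFormatClass(OldStyleAvroOutputFormat.class);

        System.out.println(isNewApiFormat(NewStyleTextOutputFormat.class)); // prints true
        System.out.println(isNewApiFormat(OldStyleAvroOutputFormat.class)); // prints false
    }
}
```

If that reading is right, configuring the whole job through the older JobConf-based API, where AvroOutputFormat and setCompressOutput(JobConf, boolean) belong, would be the direction to try.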

Could someone enlighten me on how to have the reducer write to an Avro sequence file?

Cheers,

--
View this message in context: http://apache-avro.679487.n3.nabble.com/How-to-direct-Reducer-to-write-avro-objects-to-avro-sequence-file-tp2663706p2663706.html
Sent from the Avro - Users mailing list archive at Nabble.com.
