hadoop-common-user mailing list archives

From Nishanth S <chinchu2...@gmail.com>
Subject Fwd: Avro MultipleOutputs in Mapreduce
Date Thu, 25 Jun 2015 17:51:31 GMT
Hello All,

We are using Avro 1.7.7 and Hadoop 2.5.1 in our project. We need to process
a mixed-mode binary file using MapReduce and write the output as multiple
Avro files, each with a different Avro schema. I looked at the
AvroMultipleOutputs class but did not fully understand what needs to be done
in the driver class. This is a map-only job whose output should be four
different Avro files (each with a different Avro schema), written to
different HDFS directories.

Do we need to set all of the key and value Avro schemas on AvroJob in the driver class?

AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.NULL));
AvroJob.setOutputValueSchema(job, A.getClassSchema());



Now, if I have schemas B, C and D, how would these be set on AvroJob?
Thanks for your help.


Thanks,
Nishan
