hadoop-mapreduce-user mailing list archives

From Nishanth S <chinchu2...@gmail.com>
Subject Fwd: Avro Map Reduce for Multiple Schemas
Date Mon, 20 Jul 2015 22:56:00 GMT
Hello,

I have to output multiple Avro files with different schemas as the output
of a MapReduce job. Currently I am achieving this by building a union of all
the schemas in the driver and then using AvroMultipleOutputs to write two
files.


// Named outputs, one per record type (null key, record value).
AvroMultipleOutputs.addNamedOutput(job, "a", AvroKeyValueOutputFormat.class,
        Schema.create(Schema.Type.NULL), A.getClassSchema());
AvroMultipleOutputs.addNamedOutput(job, "b", AvroKeyValueOutputFormat.class,
        Schema.create(Schema.Type.NULL), B.getClassSchema());

// Job-level output schemas: null key, union of record schemas as the value.
List<Schema> schemas = new ArrayList<Schema>();
schemas.add(C.getClassSchema());
schemas.add(D.getClassSchema());
AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.NULL));
AvroJob.setOutputValueSchema(job, Schema.createUnion(schemas));

Is there a better way to do this? Any help would be appreciated.

Thanks,
Nishan
