avro-user mailing list archives

From Marshall Bockrath-Vandegrift <llas...@damballa.com>
Subject Re: Different outputformats in avro map reduce job
Date Thu, 09 Jul 2015 14:23:34 GMT
Nishanth S <chinchu2884@gmail.com> writes:

> I have a map reduce job which reads a binary file and needs to output
> multiple avro files and a text format file. I was able to output
> multiple avro files using AvroMultipleOutputs. How would I modify the
> job to output text format as well, along with these avro files? Is it
> possible?

I’m not aware of a general solution to this problem in raw Java
MapReduce.  But in Parkour (Clojure MapReduce wrapper) I’ve implemented
a “de-multiplexing” output format which ties multiple outputs to
arbitrary isolated job sub-configurations, allowing each output to
specify a separate output format and any output format configuration.
This both solves your problem and avoids the need for special purpose
multiple-output classes like Avro’s.

It should be fairly straightforward to implement the same thing in Java,
or if you’re feeling adventurous it should be possible to use the
Parkour de-multiplexing output configuration from a Java job.
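
As a rough illustration of the de-multiplexing idea described above, a
plain-Java sketch might look like the following. It is not Parkour's API
and uses no Hadoop classes: Format, RecordWriter, Demux, and the
"separator" and "schema" settings are all hypothetical names chosen for
the example, and the RECORD format is only a stand-in for something like
Avro. The point it tries to show is that each named output carries its own
format and its own isolated sub-configuration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class DemuxSketch {
    /** Hypothetical stand-in for an output format: builds a writer from its own config. */
    interface Format {
        RecordWriter open(Map<String, String> conf, Writer sink);
    }

    /** Hypothetical stand-in for a per-output record writer. */
    interface RecordWriter {
        void write(String key, String value) throws IOException;
    }

    /** Plain-text format; reads a "separator" setting from its own sub-configuration. */
    static final Format TEXT = (conf, sink) -> {
        String sep = conf.getOrDefault("separator", "\t");
        return (k, v) -> sink.write(k + sep + v + "\n");
    };

    /** Placeholder for a structured format such as Avro; reads a "schema" setting. */
    static final Format RECORD = (conf, sink) -> {
        String schema = conf.getOrDefault("schema", "unknown");
        return (k, v) -> sink.write(
            "{\"schema\":\"" + schema + "\",\"" + k + "\":\"" + v + "\"}\n");
    };

    /** Routes each record to a named output, each opened with its own format and config. */
    static final class Demux {
        private final Map<String, RecordWriter> writers = new LinkedHashMap<>();

        void addOutput(String name, Format format, Map<String, String> subConf, Writer sink) {
            // Copy the sub-configuration so outputs cannot see each other's settings.
            writers.put(name, format.open(new HashMap<>(subConf), sink));
        }

        void write(String name, String key, String value) throws IOException {
            RecordWriter w = writers.get(name);
            if (w == null) {
                throw new IllegalArgumentException("no such output: " + name);
            }
            w.write(key, value);
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter textSink = new StringWriter();
        StringWriter recordSink = new StringWriter();

        Demux demux = new Demux();
        demux.addOutput("text", TEXT, Collections.singletonMap("separator", ","), textSink);
        demux.addOutput("events", RECORD, Collections.singletonMap("schema", "Event"), recordSink);

        // The same record goes to two outputs with different formats.
        demux.write("text", "id", "42");
        demux.write("events", "id", "42");

        System.out.print(textSink);
        System.out.print(recordSink);
    }
}
```

In a real job the sinks would be per-output files and each Format would
wrap an actual Hadoop OutputFormat with its own copy of the job
configuration; the sketch only models the routing and the configuration
isolation.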

-- 
Marshall Bockrath-Vandegrift <llasram@damballa.com>
Principal Software Engineer, Damballa R&D

