crunch-user mailing list archives

From Sankash Shankar <sank...@wealthfront.com>
Subject Re: How to write a generic transform method that will act upon generated avro objects in a generic fashion
Date Mon, 22 Jun 2015 22:09:16 GMT
Hello,

With regard to your question: we know the class will be one of a
pre-defined list of classes, but the exact class will not be known until
run time. In addition, the generic class GenericAvroFunction cannot be
defined both statically and with a generic type, which keeps it from being
serializable.

Thanks.
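
On the serializability point: in plain Java, a generic class can be serializable as long as it is top-level or static nested (so it captures no enclosing instance) and carries its run-time type as a Class<T> field. The sketch below illustrates only that, under stated assumptions: RecordToString, forClass and roundTrip are hypothetical names, and RecordToString is a stand-in, not a Crunch DoFn. Whether this carries over directly to GenericAvroFunction is not confirmed by the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableGenericFn {

    // Hypothetical stand-in for a DoFn-like function (NOT the Crunch API).
    // It is static nested, so it holds no hidden reference to an outer instance.
    static class RecordToString<T> implements Serializable {
        private static final long serialVersionUID = 1L;

        // java.lang.Class implements Serializable, so this field serializes too.
        private final Class<T> recordClass;

        RecordToString(Class<T> recordClass) {
            this.recordClass = recordClass;
        }

        String apply(Object input) {
            T typed = recordClass.cast(input); // run-time check against the chosen class
            return recordClass.getSimpleName() + ":" + typed;
        }
    }

    // Builds the function for a class that is only known at run time.
    static <T> RecordToString<T> forClass(Class<T> cls) {
        return new RecordToString<>(cls);
    }

    // Serializes and deserializes a value with standard Java serialization.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // The class name stands in for "one of a pre-defined list", picked at run time.
        Class<?> runtimeClass = Class.forName("java.lang.Integer");
        RecordToString<?> original = forClass(runtimeClass);
        RecordToString<?> copy = roundTrip(original);
        System.out.println(copy.apply(42)); // prints Integer:42
    }
}
```

The design choice here is to pass the class token explicitly rather than rely on the type parameter, since type parameters are erased at run time while a Class<T> field survives serialization.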



On Mon, Jun 22, 2015 at 1:23 PM, David Ortiz <dortiz@videologygroup.com>
wrote:

> When you actually write the code, will you know what the Avro record is?
> I’ve been able to do something along the lines of
>
>
>
> public class GenericAvroFunction<T extends SpecificRecordBase> extends DoFn<T, String> {
>   …
>   public void process(T input, Emitter<String> emitter) {
>     …
>   }
> }
>
>
>
> then parameterizing it in the various pipelines that use it. I'm not sure
> how to make it work at run time, though.
>
>
>
> *From:* Sankash Shankar [mailto:sankash@wealthfront.com]
> *Sent:* Monday, June 22, 2015 4:18 PM
> *To:* user@crunch.apache.org
> *Subject:* How to write a generic transform method that will act upon
> generated avro objects in a generic fashion
>
>
>
> Hello.
>
>
>
> I am writing a Crunch job that takes in an arbitrary class that extends
> SpecificRecord and performs a transformation on the fields in the class. I
> am attempting to write a parallelDo function on these classes, but
>
> public static PCollection<String> function(PCollection<? extends SpecificRecord> coll) {
>   coll.parallelDo(new DoFn<? extends SpecificRecord, String>() {
>     ...
>   }, Avros.strings());
> }
>
> will not compile, given that it expects a type at compile time, while
>
> public static PCollection<String> transformAvroToCsv(PCollection<SpecificRecord> coll) {
>   coll.parallelDo(new DoFn<SpecificRecord, String>() {
>     @Override
>     public void process(SpecificRecord input, Emitter<String> emitter) {
>     }
>   }, Avros.strings());
>   return null;
> }
>
> will fail at run time due to SpecificRecord not having an init constructor.
>
> What is the standard way of taking in generic Avro records and having a
> generic transform method to call on them?
>
>
>
> Thanks.
>
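
On the compile error in the first snippet of the original question: Java does not allow a wildcard as a direct type argument of the class being instantiated, so `new DoFn<? extends SpecificRecord, String>() { ... }` is rejected. One common workaround is to declare a named type parameter on the method and use it in the anonymous class. The sketch below shows that language rule with plain-Java stand-ins only: Record, Fn, User and transform are hypothetical names, a List plays the role of the PCollection, and none of this is the Crunch API. It does not address the run-time failure described for the second snippet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class NamedTypeParameterSketch {

    // Hypothetical stand-in for the record base type (the role SpecificRecord plays).
    interface Record {
        String describe();
    }

    // Hypothetical stand-in shaped like DoFn<S, T>: one input, outputs pushed to an emitter.
    abstract static class Fn<S, T> {
        abstract void process(S input, Consumer<T> emitter);
    }

    // Would NOT compile: a wildcard cannot be the type argument of the instantiated class.
    //   new Fn<? extends Record, String>() { ... }

    // Compiles: R is a named type parameter, so the anonymous class has a concrete supertype.
    static <R extends Record> List<String> transform(List<R> coll) {
        Fn<R, String> fn = new Fn<R, String>() {
            @Override
            void process(R input, Consumer<String> emitter) {
                emitter.accept(input.describe());
            }
        };
        List<String> out = new ArrayList<>();
        for (R record : coll) {
            fn.process(record, out::add); // collect every emitted string
        }
        return out;
    }

    // Hypothetical concrete record type used only for the demonstration.
    static final class User implements Record {
        private final String name;

        User(String name) {
            this.name = name;
        }

        @Override
        public String describe() {
            return "user=" + name;
        }
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User("ann"), new User("bob"));
        System.out.println(transform(users)); // prints [user=ann, user=bob]
    }
}
```

A caller that only holds a `List<? extends Record>` can still call `transform`, because the compiler captures the wildcard into R at the call site.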
