crunch-user mailing list archives

From David Ortiz <dor...@videologygroup.com>
Subject RE: How to write a generic transform method that will act upon generated avro objects in a generic fashion
Date Mon, 22 Jun 2015 20:23:09 GMT
When you actually write the code will you know what the avro record is?  I’ve been able to
do something along the lines of

public class GenericAvroFunction<T extends SpecificRecordBase> extends DoFn<T, String>
{
…

public void process(T input, Emitter<String> emitter) {
…
}
}

then parameterizing it in the various pipelines that use it.  Not sure with regards to making
it work at run time though.
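
A minimal, self-contained sketch of the compile-time side of this, for anyone who wants to try the generics without Crunch or Avro on the classpath. Everything in it is a hypothetical stand-in and not the real API: Record plays the role of SpecificRecord, Fn plays the role of DoFn, a Consumer stands in for Emitter, and a plain List stands in for PCollection. It shows a method-level type parameter taking the place of the wildcard, and it says nothing about the run-time behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for SpecificRecord: positional field access only.
interface Record {
    Object get(int field);
    int fieldCount();
}

// Hypothetical stand-in for DoFn<S, T>; a Consumer<T> plays the Emitter.
abstract class Fn<S, T> {
    abstract void process(S input, Consumer<T> emitter);
}

// Same shape as GenericAvroFunction above: the record type stays a type parameter.
class GenericRecordFn<T extends Record> extends Fn<T, String> {
    @Override
    void process(T input, Consumer<String> emitter) {
        // Join the record's fields into one comma-separated line.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.fieldCount(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(input.get(i));
        }
        emitter.accept(sb.toString());
    }
}

final class GenericFnSketch {
    // `new Fn<? extends Record, String>() {...}` is rejected by the compiler,
    // because a wildcard cannot be used as a type argument when instantiating.
    // Naming the type as <T extends Record> on the method avoids that.
    static <T extends Record> List<String> toCsv(List<T> records) {
        Fn<T, String> fn = new GenericRecordFn<T>();
        List<String> out = new ArrayList<>();
        for (T r : records) {
            fn.process(r, out::add);
        }
        return out;
    }
}

// Hypothetical concrete record, standing in for a generated Avro class.
final class User implements Record {
    private final String name;
    private final int age;

    User(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public Object get(int field) {
        if (field == 0) {
            return name;
        }
        return age;
    }

    @Override
    public int fieldCount() {
        return 2;
    }
}
```

Each caller then fixes T by passing a list of one concrete record type, e.g. GenericFnSketch.toCsv(listOfUsers), which is the "parameterizing it in the various pipelines" part.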

From: Sankash Shankar [mailto:sankash@wealthfront.com]
Sent: Monday, June 22, 2015 4:18 PM
To: user@crunch.apache.org
Subject: How to write a generic transform method that will act upon generated avro objects in a generic fashion

Hello.

I am writing a Crunch job that takes in an arbitrary class that extends SpecificRecord and
performs a transformation on the fields in the class. I am attempting to write a parallelDo
function on these classes, but

public static PCollection<String> function(PCollection<? extends SpecificRecord> coll) {
  coll.parallelDo(new DoFn<? extends SpecificRecord, String>() {
    ...
  }, Avros.strings());
}

will not compile given it expects a type at compile time, while

public static PCollection<String> transformAvroToCsv(PCollection<SpecificRecord> coll) {
  coll.parallelDo(new DoFn<SpecificRecord, String>() {
    @Override
    public void process(SpecificRecord input, Emitter<String> emitter) {
    }
  }, Avros.strings());
  return null;
}

will fail at run-time due to SpecificRecord not having an init constructor.
What is the standard way for taking in generic avro records and having a generic
transform method to call on them?

Thanks.