From Dave McAlpin <dmcal...@inome.com>
Subject RE: General-Purpose Serialization and Deserialization for Avro-Generated SpecificRecords
Date Wed, 19 Feb 2014 00:44:42 GMT
Here are some utility functions we've used for serialization to and from JSON. Something similar
should work for binary.

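import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumWriter;
import org.apache.avro.specific.SpecificRecord;

public <T> String avroEncodeAsJson(Class<T> clazz, Object object) {
    String avroEncodedJson = null;
    try {
        if (object == null || !(object instanceof SpecificRecord)) {
            return null;
        }
        T record = (T) object;
        Schema schema = ((SpecificRecord) record).getSchema();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Encoder e = EncoderFactory.get().jsonEncoder(schema, out);
        SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
        w.write(record, e);
        e.flush(); // the JSON encoder buffers; flush before reading the bytes
        avroEncodedJson = new String(out.toByteArray(), "UTF-8");
    } catch (IOException ex) {
        // Encoding failed; fall through and return null.
    }
    return avroEncodedJson;
}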

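import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.JsonDecoder;
import org.apache.avro.specific.SpecificDatumReader;

public <T> T jsonDecodeToAvro(String inputString, Class<T> className, Schema schema) {
    T returnObject = null;
    try {
        JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(schema, inputString);
        SpecificDatumReader<T> reader = new SpecificDatumReader<T>(className);
        returnObject = reader.read(null, jsonDecoder);
    } catch (IOException e) {
        // Decoding failed; fall through and return null.
    }
    return returnObject;
}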
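
For the binary case, a minimal sketch along the same lines might look like the following. The
class and method names (AvroBinaryUtil, toBytes, fromBytes) are made up for illustration, and
the sketch assumes the record classes were generated by the Avro compiler, so that
SpecificDatumReader can look up their schema from the class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificDatumWriter;
import org.apache.avro.specific.SpecificRecord;

/** Hypothetical helper; the name is illustrative and not part of Avro. */
final class AvroBinaryUtil {
    private AvroBinaryUtil() {}

    /** Serializes any SpecificRecord to Avro binary, entirely in memory. */
    static <T extends SpecificRecord> byte[] toBytes(T record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        // The record carries its own schema, so no per-type lookup is needed here.
        SpecificDatumWriter<T> writer = new SpecificDatumWriter<T>(record.getSchema());
        writer.write(record, encoder);
        encoder.flush(); // binaryEncoder buffers, so flush before reading the bytes
        return out.toByteArray();
    }

    /** Deserializes Avro binary into the generated class named by clazz. */
    static <T extends SpecificRecord> T fromBytes(byte[] bytes, Class<T> clazz)
            throws IOException {
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        // Assumes clazz is an Avro-generated class whose schema the reader can find.
        SpecificDatumReader<T> reader = new SpecificDatumReader<T>(clazz);
        return reader.read(null, decoder);
    }
}
```

With a generated class such as MyObject, the round trip would be
byte[] bytes = AvroBinaryUtil.toBytes(x); followed by
MyObject x2 = AvroBinaryUtil.fromBytes(bytes, MyObject.class); with no cast at the call site.
The reader here is a SpecificDatumReader, which builds instances of the generated class; a
GenericDatumReader builds GenericData.Record instances, which cannot be cast to the generated
class.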


From: flaming.zelda@gmail.com [mailto:flaming.zelda@gmail.com] On Behalf Of Gary Steelman
Sent: Tuesday, February 18, 2014 4:21 PM
To: user@avro.apache.org
Subject: General-Purpose Serialization and Deserialization for Avro-Generated SpecificRecords

Hi all,
Here's my use case: I've got a bunch of different Java objects generated from Avro schema
files. So the class definition headers look something like this: public class MyObject extends
org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord.
I've got many other types besides MyObject too. I need to write a method which can serialize
(from MyObject or another class to byte[]) and deserialize (from byte[] to MyObject or another
class) in memory (not writing to disk).
I couldn't figure out how to write one method to handle it for SpecificRecord, so I tried
serializing/deserializing these things as GenericRecord instead:

  public static byte[] serializeFromAvro(GenericRecord gr) {
    try {
      DatumWriter<GenericRecord> writer2 = new GenericDatumWriter<GenericRecord>(gr.getSchema());
      ByteArrayOutputStream bao2 = new ByteArrayOutputStream();
      BinaryEncoder encoder2 = EncoderFactory.get().directBinaryEncoder(bao2, null);
      writer2.write(gr, encoder2);
      byte[] avroBytes2 = bao2.toByteArray();
      return avroBytes2;
    } catch (IOException e) {
      return null;
    }
  }

  // Here I use a DataType enum and the AvroSchemaFactory to quickly retrieve a Schema object
  // for a supported DataType.
  public static GenericRecord deserializeFromAvro(byte[] avroBytes, DataType dataType) {
    try {
      Schema schema = AvroSchemaFactory.getInstance().getSchema(dataType);
      DatumReader<GenericRecord> reader2 = new GenericDatumReader<GenericRecord>(schema);
      ByteArrayInputStream bai2 = new ByteArrayInputStream(avroBytes);
      BinaryDecoder decoder2 = DecoderFactory.get().directBinaryDecoder(bai2, null);
      GenericRecord gr2 = reader2.read(null, decoder2);
      return gr2;
    } catch (Exception e) {
      return null;
    }
  }
And I use them like this:
// Remember MyObject is the SpecificRecord implementing class.
MyObject x = new MyObject();
byte[] avroBytes = serializeFromAvro(x);
MyObject x2 = (MyObject) deserializeFromAvro(avroBytes, DataType.MyObject);
Which results in this:
java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to
Is there an easier way to achieve my use case, or some way I can fix my methods to allow the
sort of behavior I want?

