avro-user mailing list archives

From Dave McAlpin <dmcal...@inome.com>
Subject RE: General-Purpose Serialization and Deserialization for Avro-Generated SpecificRecords
Date Wed, 19 Feb 2014 01:57:46 GMT
That's great Gary. Thanks for the follow up.

Dave

From: flaming.zelda@gmail.com [mailto:flaming.zelda@gmail.com] On Behalf Of Gary Steelman
Sent: Tuesday, February 18, 2014 5:15 PM
To: Gary Steelman
Cc: user@avro.apache.org
Subject: Re: General-Purpose Serialization and Deserialization for Avro-Generated SpecificRecords

Hey all, I've adapted Dave's solution to serialize to/from byte[] rather than JSON. Thanks
a lot! The two methods are below:

  @SuppressWarnings("unchecked")
  public static <T> byte[] avroSerialize(Class<T> clazz, Object object) {
    byte[] ret = null;
    try {
      if (object == null || !(object instanceof SpecificRecord)) {
        return null;
      }

      T record = (T) object;
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      Encoder e = EncoderFactory.get().directBinaryEncoder(out, null);
      SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
      w.write(record, e);
      e.flush();
      ret = out.toByteArray();
    } catch (IOException e) {
      LOG.debug(e);
    }

    return ret;
  }

  public static <T> T avroDeserialize(byte[] avroBytes, Class<T> clazz, Schema schema) {
    T ret = null;
    try {
      ByteArrayInputStream in = new ByteArrayInputStream(avroBytes);
      Decoder d = DecoderFactory.get().directBinaryDecoder(in, null);
      SpecificDatumReader<T> reader = new SpecificDatumReader<T>(clazz);
      ret = reader.read(null, d);
    } catch (IOException e) {
      LOG.debug(e);
    }

    return ret;
  }
And they're called like so:
MyObject x = new MyObject();
byte[] avroBytes = avroSerialize(x.getClass(), x);
MyObject y = avroDeserialize(avroBytes, MyObject.class, MyObject.SCHEMA$);
Thanks,
Gary

On Tue, Feb 18, 2014 at 6:49 PM, Gary Steelman <gary.steelman42@gmail.com> wrote:

Thank you Dave, I appreciate it. I'll give those a shot and let you know how it goes.

-Gary
On Feb 18, 2014 6:45 PM, "Dave McAlpin" <dmcalpin@inome.com> wrote:
Here are some utility functions we've used for serialization to and from JSON. Something similar
should work for binary.

public <T> String avroEncodeAsJson(Class<T> clazz, Object object) {
    String avroEncodedJson = null;
    try {
        if (object == null || !(object instanceof SpecificRecord)) {
            return null;
        }
        T record = (T) object;
        Schema schema = ((SpecificRecord) record).getSchema();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Encoder e = EncoderFactory.get().jsonEncoder(schema, out);
        SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
        w.write(record, e);
        e.flush();
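        // Decode explicitly as UTF-8 (the encoding the JSON encoder writes) rather than
        // relying on the platform default charset.
        avroEncodedJson = new String(out.toByteArray(), "UTF-8");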
    } catch (IOException e) {
        e.printStackTrace();
    }

    return avroEncodedJson;
}

public <T> T jsonDecodeToAvro(String inputString, Class<T> className, Schema schema) {
    T returnObject = null;
    try {
        JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(schema, inputString);
        SpecificDatumReader<T> reader = new SpecificDatumReader<T>(className);
        returnObject = reader.read(null, jsonDecoder);
    } catch (IOException e) {
        e.printStackTrace();
    }

    return returnObject;
}

Dave

From: flaming.zelda@gmail.com [mailto:flaming.zelda@gmail.com] On Behalf Of Gary Steelman
Sent: Tuesday, February 18, 2014 4:21 PM
To: user@avro.apache.org
Subject: General-Purpose Serialization and Deserialization for Avro-Generated SpecificRecords

Hi all,
Here's my use case: I've got a bunch of different Java objects generated from Avro schema
files. So the class definition headers look something like this: public class MyObject extends
org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord.
I've got many other types than MyObject too. I need to write a method which can serialize
(from MyObject or another class to byte[]) and deserialize (from byte[] to MyObject or another
class) in memory (not writing to disk).
I couldn't figure out how to write one method to handle it for SpecificRecord, so I tried
serializing/deserializing these things as GenericRecord instead:

  public static byte[] serializeFromAvro(GenericRecord gr) {
    try {
      DatumWriter<GenericRecord> writer2 = new GenericDatumWriter<GenericRecord>(gr.getSchema());
      ByteArrayOutputStream bao2 = new ByteArrayOutputStream();
      BinaryEncoder encoder2 = EncoderFactory.get().directBinaryEncoder(bao2, null);
      writer2.write(gr, encoder2);
      byte[] avroBytes2 = bao2.toByteArray();
      return avroBytes2;
    } catch (IOException e) {
      LOG.debug(e);
      return null;
    }
  }
  // Here I use a DataType enum and the AvroSchemaFactory to quickly retrieve a Schema object
  // for a supported DataType.
  public static GenericRecord deserializeFromAvro(byte[] avroBytes, DataType dataType) {
    try {
      Schema schema = AvroSchemaFactory.getInstance().getSchema(dataType);
      DatumReader<GenericRecord> reader2 = new GenericDatumReader<GenericRecord>(schema);
      ByteArrayInputStream bai2 = new ByteArrayInputStream(avroBytes);
      BinaryDecoder decoder2 = DecoderFactory.get().directBinaryDecoder(bai2, null);
      GenericRecord gr2 = reader2.read(null, decoder2);
      return gr2;
    } catch (Exception e) {
      LOG.debug(e);
      return null;
    }
  }
And I use them like so:
// Remember MyObject is the SpecificRecord implementing class.
MyObject x = new MyObject();
byte[] avroBytes = serializeFromAvro(x);
MyObject x2 = (MyObject) deserializeFromAvro(avroBytes, DataType.MyObject);
Which results in this:
java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to
datatypes.generated.avro.MyObject
Is there an easier way to achieve my use case, or some way I can fix my methods to allow the
sort of behavior I want?
Thanks,
Gary
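
A note on the ClassCastException in the original question: GenericDatumReader builds GenericData$Record instances (as the exception message shows), and a Java cast only checks an object's runtime type; it never converts the object. Passing the target class to the reader, as SpecificDatumReader<T>(clazz) does in the replies, lets the reader construct the right type itself. The Avro-free sketch below illustrates only that Java behaviour. CastDemo, Record, GenericRec, readGeneric, readSpecific and genericCastFails are hypothetical names invented for the illustration, and the MyObject here is a stand-in, not the generated class.

```java
final class CastDemo {
    /** Hypothetical stand-in for an interface shared by both record kinds. */
    interface Record {
        String name();
    }

    /** Stand-in for the generic container that a schema-only reader builds. */
    static final class GenericRec implements Record {
        public String name() { return "generic"; }
    }

    /** Stand-in for a generated class such as MyObject (not the real one). */
    static final class MyObject implements Record {
        public String name() { return "specific"; }
    }

    /** Has no target class to work with, so it can only build the generic container. */
    static Record readGeneric() {
        return new GenericRec();
    }

    /** Receives the target class, so it can instantiate that class itself. */
    static <T extends Record> T readSpecific(Class<T> clazz) {
        try {
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + clazz, e);
        }
    }

    /** Returns true if casting the generic reader's result to MyObject throws. */
    static boolean genericCastFails() {
        try {
            MyObject unused = (MyObject) readGeneric(); // runtime type is GenericRec
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("cast of generic result fails: " + genericCastFails());
        System.out.println("class-token read yields: " + readSpecific(MyObject.class).name());
    }
}
```

The Class<T> parameter plays the same role here as the clazz argument in avroSerialize and avroDeserialize: it tells the reader which concrete type to build, so no cast is needed afterwards.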



