avro-user mailing list archives

From Gary Steelman <gary.steelma...@gmail.com>
Subject Re: General-Purpose Serialization and Deserialization for Avro-Generated SpecificRecords
Date Wed, 19 Feb 2014 01:14:44 GMT
Hey all, I've adapted Dave's solution to serialize to/from byte[] rather
than JSON. Thanks a lot! The two methods are below:

  @SuppressWarnings("unchecked")
  public static <T> byte[] avroSerialize(Class<T> clazz, Object object) {
    byte[] ret = null;
    try {
      if (object == null || !(object instanceof SpecificRecord)) {
        return null;
      }

      T record = (T) object;
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      Encoder e = EncoderFactory.get().directBinaryEncoder(out, null);
      SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
      w.write(record, e);
      e.flush();
      ret = out.toByteArray();
    } catch (IOException e) {
      LOG.debug(e);
    }

    return ret;
  }

  public static <T> T avroDeserialize(byte[] avroBytes, Class<T> clazz,
      Schema schema) {
    T ret = null;
    try {
      ByteArrayInputStream in = new ByteArrayInputStream(avroBytes);
      Decoder d = DecoderFactory.get().directBinaryDecoder(in, null);
      SpecificDatumReader<T> reader = new SpecificDatumReader<T>(clazz);
      ret = reader.read(null, d);
    } catch (IOException e) {
      LOG.debug(e);
    }

    return ret;
  }

And they're called like so:
MyObject x = new MyObject();
byte[] avroBytes = avroSerialize(x.getClass(), x);
MyObject y = avroDeserialize(avroBytes, MyObject.class, MyObject.SCHEMA$);
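
A note on the (T) cast in avroSerialize: it is unchecked, which is why the
method needs @SuppressWarnings("unchecked"). Below is a minimal sketch of one
alternative, written in plain Java with no Avro on the classpath: check and
cast through the Class<T> token instead. RecordLike, MyObject and asRecord
are hypothetical stand-ins for illustration only (RecordLike plays the role
of SpecificRecord); they are not part of Avro or of the two methods above.

```java
import java.util.Optional;

public class SafeCastSketch {
    /** Hypothetical stand-in for Avro's SpecificRecord interface. */
    public interface RecordLike {}

    /** Hypothetical stand-in for a generated record class. */
    public static final class MyObject implements RecordLike {}

    /**
     * Returns the object as T when it is an instance of clazz, else empty.
     * Class.isInstance is false for null, so no separate null check is
     * needed, and Class.cast avoids the unchecked (T) cast.
     */
    public static <T extends RecordLike> Optional<T> asRecord(Class<T> clazz, Object object) {
        if (!clazz.isInstance(object)) {
            return Optional.empty();
        }
        return Optional.of(clazz.cast(object));
    }

    public static void main(String[] args) {
        System.out.println(asRecord(MyObject.class, new MyObject()).isPresent());
        System.out.println(asRecord(MyObject.class, "not a record").isPresent());
        System.out.println(asRecord(MyObject.class, null).isPresent());
    }
}
```

In the real methods the same check could presumably replace the instanceof
test, with T bounded by SpecificRecord, but that variant is untested here.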

Thanks,
Gary


On Tue, Feb 18, 2014 at 6:49 PM, Gary Steelman <gary.steelman42@gmail.com> wrote:

> Thank you Dave, I appreciate it. I'll give those a shot and let you know
> how it goes.
>
> -Gary
> On Feb 18, 2014 6:45 PM, "Dave McAlpin" <dmcalpin@inome.com> wrote:
>
>>  Here are some utility functions we've used for serialization to and
>> from JSON. Something similar should work for binary.
>>
>>
>> public <T> String avroEncodeAsJson(Class<T> clazz, Object object) {
>>     String avroEncodedJson = null;
>>     try {
>>         if (object == null || !(object instanceof SpecificRecord)) {
>>             return null;
>>         }
>>         T record = (T) object;
>>         Schema schema = ((SpecificRecord) record).getSchema();
>>         ByteArrayOutputStream out = new ByteArrayOutputStream();
>>         Encoder e = EncoderFactory.get().jsonEncoder(schema, out);
>>         SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
>>         w.write(record, e);
>>         e.flush();
>>         avroEncodedJson = new String(out.toByteArray());
>>     } catch (IOException e) {
>>         e.printStackTrace();
>>     }
>>
>>     return avroEncodedJson;
>> }
>>
>> public <T> T jsonDecodeToAvro(String inputString, Class<T> className,
>>         Schema schema) {
>>     T returnObject = null;
>>     try {
>>         JsonDecoder jsonDecoder =
>>                 DecoderFactory.get().jsonDecoder(schema, inputString);
>>         SpecificDatumReader<T> reader =
>>                 new SpecificDatumReader<T>(className);
>>         returnObject = reader.read(null, jsonDecoder);
>>     } catch (IOException e) {
>>         e.printStackTrace();
>>     }
>>
>>     return returnObject;
>> }
>>
>> Dave
>>
>>
>>
>> *From:* flaming.zelda@gmail.com [mailto:flaming.zelda@gmail.com] *On
>> Behalf Of *Gary Steelman
>> *Sent:* Tuesday, February 18, 2014 4:21 PM
>> *To:* user@avro.apache.org
>> *Subject:* General-Purpose Serialization and Deserialization for
>> Avro-Generated SpecificRecords
>>
>>
>>
>> Hi all,
>>
>> Here's my use case: I've got a bunch of different Java objects generated
>> from Avro schema files. So the class definition headers look something like
>> this: public class MyObject extends
>> org.apache.avro.specific.SpecificRecordBase implements
>> org.apache.avro.specific.SpecificRecord. I've got many other types than
>> MyObject too. I need to write a method which can serialize (from MyObject
>> or another class to byte[]) and deserialize (from byte[] to MyObject or
>> another class) in memory (not writing to disk).
>>
>> I couldn't figure out how to write one method to handle it for
>> SpecificRecord, so I tried serializing/deserializing these things as
>> GenericRecord instead:
>>
>>   public static byte[] serializeFromAvro(GenericRecord gr) {
>>     try {
>>       DatumWriter<GenericRecord> writer2 =
>>           new GenericDatumWriter<GenericRecord>(gr.getSchema());
>>       ByteArrayOutputStream bao2 = new ByteArrayOutputStream();
>>       BinaryEncoder encoder2 =
>>           EncoderFactory.get().directBinaryEncoder(bao2, null);
>>       writer2.write(gr, encoder2);
>>       byte[] avroBytes2 = bao2.toByteArray();
>>       return avroBytes2;
>>     } catch (IOException e) {
>>       LOG.debug(e);
>>       return null;
>>     }
>>   }
>>
>>   // Here I use a DataType enum and the AvroSchemaFactory to quickly
>>   // retrieve a Schema object for a supported DataType.
>>
>>   public static GenericRecord deserializeFromAvro(byte[] avroBytes,
>>       DataType dataType) {
>>     try {
>>       Schema schema = AvroSchemaFactory.getInstance().getSchema(dataType);
>>       DatumReader<GenericRecord> reader2 =
>>           new GenericDatumReader<GenericRecord>(schema);
>>       ByteArrayInputStream bai2 = new ByteArrayInputStream(avroBytes);
>>       BinaryDecoder decoder2 =
>>           DecoderFactory.get().directBinaryDecoder(bai2, null);
>>       GenericRecord gr2 = reader2.read(null, decoder2);
>>       return gr2;
>>     } catch (Exception e) {
>>       LOG.debug(e);
>>       return null;
>>     }
>>   }
>>
>> And use them like so:
>>
>> // Remember MyObject is the SpecificRecord implementing class.
>> MyObject x = new MyObject();
>> byte[] avroBytes = serializeFromAvro(x);
>> MyObject x2 = (MyObject) deserializeFromAvro(avroBytes,
>>     DataType.MyObject);
>>
>> Which results in this:
>> java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record
>> cannot be cast to datatypes.generated.avro.MyObject
>>
>> Is there an easier way to achieve my use case, or some way I can fix my
>> methods to allow the sort of behavior I want?
>>
>> Thanks,
>>
>> Gary
>>
>
