From: Dave McAlpin
To: "user@avro.apache.org"
Subject: RE: General-Purpose Serialization and Deserialization for Avro-Generated SpecificRecords
Date: Wed, 19 Feb 2014 00:44:42 +0000

Here are some utility functions we've used for serialization to and from JSON. Something similar should work for binary.

public <T> String avroEncodeAsJson(Class<T> clazz, Object object) {
    String avroEncodedJson = null;
    try {
        if (object == null || !(object instanceof SpecificRecord)) {
            return null;
        }
        T record = (T) object;
        Schema schema = ((SpecificRecord) record).getSchema();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Encoder e = EncoderFactory.get().jsonEncoder(schema, out);
        SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
        w.write(record, e);
        e.flush();
        avroEncodedJson = new String(out.toByteArray());
    } catch (IOException e) {
        e.printStackTrace();
    }

    return avroEncodedJson;
}
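
A side note on the helper above: new String(out.toByteArray()) decodes with the JVM's default charset, so the result can differ between machines. The following is a minimal, Avro-free sketch of decoding the buffered bytes explicitly as UTF-8, on the assumption that the JSON encoder's output is UTF-8. Utf8Decode and toUtf8String are hypothetical names used only for illustration; they are not part of Avro or of the utilities in this message.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of Avro or of the original utilities.
final class Utf8Decode {

    // Decode everything written to the buffer as UTF-8, regardless of the
    // platform default charset.
    public static String toUtf8String(ByteArrayOutputStream out) {
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for encoder output: UTF-8 JSON bytes that contain one
        // non-ASCII character (written as a Unicode escape in the source).
        byte[] jsonBytes = "{\"name\":\"Zo\u00eb\"}".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(jsonBytes, 0, jsonBytes.length);

        // 15 bytes went in; the decoded string has 14 characters.
        String json = toUtf8String(out);
        System.out.println(jsonBytes.length + " bytes, " + json.length() + " chars");
    }
}
```

If this assumption holds for the encoder in use, the only change needed in the helper would be the second argument to the String constructor.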


public <T> T jsonDecodeToAvro(String inputString, Class<T> className, Schema schema) {
    T returnObject = null;
    try {
        JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(schema, inputString);
        SpecificDatumReader<T> reader = new SpecificDatumReader<T>(className);
        returnObject = reader.read(null, jsonDecoder);
    } catch (IOException e) {
        e.printStackTrace();
    }

    return returnObject;
}


Dave


From: flaming.zelda@gmail.com [mailto:flaming.zelda@gmail.com] On Behalf Of Gary Steelman
Sent: Tuesday, February 18, 2014 4:21 PM
To: user@avro.apache.org
Subject: General-Purpose Serialization and Deserialization for Avro-Generated SpecificRecords

 

Hi all,

Here's my use case: I've got a bunch of different Java objects generated from Avro schema files. So the class definition headers look something like this: public class MyObject extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord. I've got many other types than MyObject too. I need to write a method which can serialize (from MyObject or another class to byte[]) and deserialize (from byte[] to MyObject or another class) in memory (not writing to disk).

I couldn't figure out how to write one method to handle it for SpecificRecord, so I tried serializing/deserializing these things as GenericRecord instead:

  public static byte[] serializeFromAvro(GenericRecord gr) {
    try {
      DatumWriter<GenericRecord> writer2 = new GenericDatumWriter<GenericRecord>(gr.getSchema());
      ByteArrayOutputStream bao2 = new ByteArrayOutputStream();
      BinaryEncoder encoder2 = EncoderFactory.get().directBinaryEncoder(bao2, null);
      writer2.write(gr, encoder2);
      byte[] avroBytes2 = bao2.toByteArray();
      return avroBytes2;
    } catch (IOException e) {
      LOG.debug(e);
      return null;
    }
  }

  // Here I use a DataType enum and the AvroSchemaFactory to quickly retrieve a Schema object for a supported DataType.
  public static GenericRecord deserializeFromAvro(byte[] avroBytes, DataType dataType) {
    try {
      Schema schema = AvroSchemaFactory.getInstance().getSchema(dataType);
      DatumReader<GenericRecord> reader2 = new GenericDatumReader<GenericRecord>(schema);
      ByteArrayInputStream bai2 = new ByteArrayInputStream(avroBytes);
      BinaryDecoder decoder2 = DecoderFactory.get().directBinaryDecoder(bai2, null);
      GenericRecord gr2 = reader2.read(null, decoder2);
      return gr2;
    } catch (Exception e) {
      LOG.debug(e);
      return null;
    }
  }

And use them like so:

// Remember MyObject is the SpecificRecord implementing class.
MyObject x = new MyObject();
byte[] avroBytes = serializeFromAvro(x);
MyObject x2 = (MyObject) deserializeFromAvro(avroBytes, DataType.MyObject);

Which results in this:
java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to datatypes.generated.avro.MyObject
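
The exception follows from the choice of reader: the generic reader hands back a GenericData$Record, which is not a subclass of MyObject, so the cast cannot succeed. The JSON helpers at the top of this message avoid that by taking a Class<T> and passing it to SpecificDatumReader. The sketch below models only that difference in plain Java, without Avro on the classpath. Record, GenericRec, MyObject, readGeneric and readSpecific are hypothetical stand-ins invented for this illustration, not Avro's classes or methods.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the cast problem. Every type here is a hypothetical
// stand-in, NOT an Avro class.
final class ClassTokenDemo {

    interface Record {
        void put(String field, Object value);
    }

    // Plays the role of the generic record type.
    static class GenericRec implements Record {
        final Map<String, Object> fields = new HashMap<>();
        public void put(String field, Object value) { fields.put(field, value); }
    }

    // Plays the role of a generated class; it is unrelated to GenericRec.
    static class MyObject implements Record {
        String name;
        public void put(String field, Object value) {
            if ("name".equals(field)) { name = (String) value; }
        }
    }

    // Always builds a GenericRec, whatever type the caller hoped to get back.
    public static Record readGeneric(Map<String, Object> data) {
        Record rec = new GenericRec();
        data.forEach(rec::put);
        return rec;
    }

    // Class-token variant: the caller names the type to instantiate.
    public static <T extends Record> T readSpecific(Map<String, Object> data, Class<T> clazz) {
        try {
            T rec = clazz.getDeclaredConstructor().newInstance();
            data.forEach(rec::put);
            return rec;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + clazz, e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> data = new HashMap<>();
        data.put("name", "example");

        Record generic = readGeneric(data);
        try {
            MyObject bad = (MyObject) generic; // compiles, but fails at run time
            System.out.println(bad.name);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }

        MyObject ok = readSpecific(data, MyObject.class); // no cast needed
        System.out.println(ok.name);
    }
}
```

The design point is that the caller supplies the target class, so one generic method can serve every generated type without a downcast at the call site.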

Is there an easier way to achieve my use case, or some way I can fix my methods to allow the sort of behavior I want?

Thanks,

Gary
