avro-dev mailing list archives

From Ryan Blue <b...@cloudera.com>
Subject Re: [DISCUSS][JAVA] Generating toBytes/fromBytes methods?
Date Mon, 21 Dec 2015 22:30:35 GMT
Niels,

Having methods like this sounds like a good idea to me. I've had to 
write them myself several times!

The idea is also related to AVRO-1704 [1], which proposes standardizing 
the encoding used for single records. Some projects, for example, embed 
the schema fingerprint at the start of each record so that a reader can 
tell which schema was used to write it. That would be helpful here too.
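
To make the fingerprint idea concrete, here is a minimal sketch of such a framing using only the JDK. The class name FingerprintFraming, the 8-byte width, and the big-endian byte order are assumptions made for illustration; the thread does not specify a layout, and AVRO-1704 may settle on a different one. The fingerprint value itself is supplied by the caller (with Avro it could come from SchemaNormalization.parsingFingerprint64(schema)).

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical helper: prefixes an encoded record with a 64-bit schema fingerprint. */
final class FingerprintFraming {
    /** Assumed prefix width: one 64-bit fingerprint. */
    static final int FINGERPRINT_SIZE = Long.BYTES;

    private FingerprintFraming() {
    }

    /** Returns fingerprint (big-endian, an assumption) followed by the payload bytes. */
    static byte[] frame(long fingerprint, byte[] payload) {
        return ByteBuffer.allocate(FINGERPRINT_SIZE + payload.length)
                .putLong(fingerprint)
                .put(payload)
                .array();
    }

    /** Reads the fingerprint back so the reader can look up the writer schema. */
    static long fingerprintOf(byte[] framed) {
        requirePrefix(framed);
        return ByteBuffer.wrap(framed).getLong();
    }

    /** Returns a copy of the encoded record without the fingerprint prefix. */
    static byte[] payloadOf(byte[] framed) {
        requirePrefix(framed);
        return Arrays.copyOfRange(framed, FINGERPRINT_SIZE, framed.length);
    }

    private static void requirePrefix(byte[] framed) {
        if (framed.length < FINGERPRINT_SIZE) {
            throw new IllegalArgumentException("message is shorter than the fingerprint prefix");
        }
    }
}
```

A consumer would call fingerprintOf first, use the result to select a known schema, and only then decode payloadOf with a reader built for that schema.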

It may also be a good idea to create a helper object rather than 
attaching new methods to the datum classes themselves. In your example 
below, you have to create a new encoder or decoder for each method call. 
We could instead keep a backing buffer and encoder/decoder on a class 
that the caller instantiates so that they can be reused. At the same 
time, that would make it possible to reuse the class with any data model 
and manage the available schemas (if embedding the fingerprint).

I'm thinking something like this:

   ReflectClass datum = new ReflectClass();
   ReflectData model = ReflectData.get();
   Schema schema = model.getSchema(ReflectClass.class);
   DatumCodec<ReflectClass> codec = new DatumCodec<>(model, schema);

   // convert datum to bytes using data model
   byte[] asBytes = codec.toBytes(datum);

   // convert bytes to datum using data model
   ReflectClass copy = codec.fromBytes(asBytes);
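
One way the helper class behind that usage might look is sketched below. DatumCodec is a hypothetical name taken from the example above, not an existing Avro class, and the generic parameter, the constructor shape, and the lack of thread safety are all assumptions. The sketch relies only on existing Avro calls: GenericData.createDatumWriter/createDatumReader and the reuse arguments of EncoderFactory.binaryEncoder and DecoderFactory.binaryDecoder. It does not embed a schema fingerprint.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

/** Hypothetical helper (not part of Avro). One instance per thread: it is not thread-safe. */
final class DatumCodec<D> {
    private final DatumWriter<D> writer;
    private final DatumReader<D> reader;
    // Backing buffer and encoder/decoder are kept between calls so they can be reused.
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private BinaryEncoder encoder;
    private BinaryDecoder decoder;

    /** Works with any data model: GenericData, SpecificData or ReflectData. */
    @SuppressWarnings("unchecked")
    DatumCodec(GenericData model, Schema schema) {
        this.writer = (DatumWriter<D>) model.createDatumWriter(schema);
        this.reader = (DatumReader<D>) model.createDatumReader(schema);
    }

    byte[] toBytes(D datum) throws IOException {
        // Passing the previous encoder lets the factory reconfigure it instead of allocating.
        encoder = EncoderFactory.get().binaryEncoder(buffer, encoder);
        // Reset last, so anything a previous failed call left in the buffer is dropped.
        buffer.reset();
        writer.write(datum, encoder);
        encoder.flush();
        return buffer.toByteArray();
    }

    D fromBytes(byte[] bytes) throws IOException {
        // Same reuse pattern on the read side.
        decoder = DecoderFactory.get().binaryDecoder(bytes, decoder);
        return reader.read(null, decoder);
    }
}
```

Because the data model is a constructor argument, the same class would serve generic, specific, and reflect datum classes without generating extra methods into each of them.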

What do you think?

rb


[1]: https://issues.apache.org/jira/browse/AVRO-1704

On 12/18/2015 05:01 AM, Niels Basjes wrote:
> Hi,
>
> I'm working on a project where I'm putting Avro records into Kafka and at
> the other end pull them out again.
> For that purpose I wrote two methods 'toBytes' and 'fromBytes' in a
> separate class (see below).
>
> I see this as the type of problem many developers run into.
> Would it be a good idea to generate methods like these into the generated
> Java code?
>
> This would make it possible to serialize and deserialize single records
> like this:
>
> byte [] someBytes = measurement.toBytes();
> Measurement m = Measurement.fromBytes(someBytes);
>
> Niels Basjes
>
> P.S. Perhaps name it getBytes rather than toBytes (similar to what the
> String class has)
>
> public final class MeasurementSerializer {
>      private MeasurementSerializer() {
>      }
>
>      public static Measurement fromBytes(byte[] bytes) throws IOException {
>          try {
>              DatumReader<Measurement> reader = new SpecificDatumReader<>(Measurement.getClassSchema());
>              Decoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
>              return reader.read(null, decoder);
>          } catch (RuntimeException rex) {
>              throw new IOException(rex.getMessage(), rex); // keep the original cause
>          }
>      }
>
>      public static byte[] toBytes(Measurement measurement) throws IOException {
>          try {
>              ByteArrayOutputStream out = new ByteArrayOutputStream();
>              Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
>              SpecificDatumWriter<Measurement> writer = new SpecificDatumWriter<>(Measurement.class);
>              writer.write(measurement, encoder);
>              encoder.flush();
>              out.close();
>              return out.toByteArray();
>          } catch (RuntimeException rex) {
>              throw new IOException(rex.getMessage(), rex); // keep the original cause
>          }
>      }
> }
>
>
>


-- 
Ryan Blue
Software Engineer
Cloudera, Inc.
