avro-dev mailing list archives

From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] Updated: (AVRO-295) JsonEncoder is not flushed after writing using ReflectDatumWriter
Date Fri, 08 Jan 2010 00:34:54 GMT

     [ https://issues.apache.org/jira/browse/AVRO-295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated AVRO-295:
--------------------------------

    Description: 
JsonEncoder needs to be flushed, otherwise data may be left in its buffer and never reach the
underlying stream. Ideally the behavior should be the same regardless of which kind of Encoder is passed in. Here is some example code:


{code}
// imports assumed by this snippet (Avro reflect/io classes and JDK streams)
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.JsonEncoder;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumWriter;

class A {
  long timestamp;
}

  public void testEventSchemaSerializeBinary() throws IOException {
    A e = new A();
    e.timestamp = 1234;
    ReflectData reflectData = ReflectData.get();
    Schema schm = reflectData.getSchema(A.class);
    System.out.println(schm);

    ReflectDatumWriter writer = new ReflectDatumWriter(schm);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    Encoder json = new BinaryEncoder(out);
    writer.write(e, json); // only one call

    byte[] bs = out.toByteArray();
    int len = bs.length; // length is 2, which is reasonable.
    System.out.println("output size: " + len);
  }


public void testSerializeJson() throws IOException {
    A a = new A();
    a.timestamp = 1234;
    ReflectData reflectData = ReflectData.get();
    Schema schm = reflectData.getSchema(A.class);
    ReflectDatumWriter writer = new ReflectDatumWriter(schm);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    JsonEncoder json = new JsonEncoder(schm, out);
    writer.write(a, json); // only one call

    // did not flush
    byte[] bs = out.toByteArray();
    int len = bs.length; // len == 0;  this is unexpected!
    System.out.println("output size: " + len); 
 
    // flushed this time. this is a bit unwieldy
    json.flush(); 
    bs = out.toByteArray();
    len = bs.length; // len == 18; this is better!
    System.out.println("output size: " + len);
}

{code}
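
Until the API changes, the caller has to remember to flush the encoder itself. A minimal sketch of a workaround helper that keeps this in one place (the toJsonBytes name is made up for illustration; the body only uses the calls shown above):

{code}
// Hypothetical workaround helper, not part of Avro: serialize one datum with
// the reflect writer and flush the JsonEncoder before reading the bytes back.
static byte[] toJsonBytes(Object datum, Schema schema) throws IOException {
  ReflectDatumWriter writer = new ReflectDatumWriter(schema);
  ByteArrayOutputStream out = new ByteArrayOutputStream();
  JsonEncoder enc = new JsonEncoder(schema, out);
  writer.write(datum, enc);
  enc.flush();                 // without this call, out.toByteArray() is empty
  return out.toByteArray();
}
{code}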

One way to deal with this is to require every Encoder to provide a flush method, so the DatumWriter
can always flush it, and potentially to add a flush method to DatumWriter as well.
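
Sketched very roughly, the proposed change might look like the following. These signatures are hypothetical and are shown only to illustrate the shape of the proposal, not the actual Avro classes:

{code}
// Hypothetical sketch of the proposal above, not the real Avro API.
public abstract class Encoder {
  // ... existing writeBoolean/writeLong/writeString/etc. methods elided ...

  // Every Encoder would be required to provide flush(); an unbuffered
  // implementation such as BinaryEncoder could simply flush (or ignore)
  // the underlying stream, while JsonEncoder would drain its buffer.
  public abstract void flush() throws IOException;
}

public interface DatumWriter<D> {
  void write(D datum, Encoder out) throws IOException;

  // Possible addition: a flush() that delegates to the Encoder last written
  // to, so callers never have to touch the Encoder directly.
  void flush() throws IOException;
}
{code}

With flush() guaranteed on every Encoder, the JSON and binary test cases above would behave the same after a single write() call.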


> JsonEncoder is not flushed after writing using ReflectDatumWriter
> -----------------------------------------------------------------
>
>                 Key: AVRO-295
>                 URL: https://issues.apache.org/jira/browse/AVRO-295
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>    Affects Versions: 1.3.0
>            Reporter: Jonathan Hsieh
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

