avro-dev mailing list archives

From "Ryan Blue (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays
Date Mon, 14 Dec 2015 20:55:46 GMT

    [ https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056701#comment-15056701 ]

Ryan Blue commented on AVRO-1584:

It looks like the conversion used for default values is independent of toString. Callers can
pass either a JsonNode, which bypasses the problem, or an object that gets [converted in JacksonUtils|https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/util/internal/JacksonUtils.java#L73].
That converts a byte array to a string using ISO-8859-1, which correctly implements the spec.
When the JSON is written, the characters that aren't allowed in JSON strings are escaped by
the generator. Changing the output of toString won't break the case that Doug mentions, but
I think it is a fair point that changing what is currently produced could break applications.
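
To illustrate the ISO-8859-1 conversion described above, here is a minimal sketch. It is not the JacksonUtils code; the class and method names (Latin1BytesDemo, bytesToJsonText, jsonTextToBytes) are hypothetical and only show that mapping each byte to the character with the same value (0-255) is lossless in both directions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Avro.
final class Latin1BytesDemo {

    // Decode with ISO-8859-1: every byte value 0-255 becomes the
    // character with the same numeric value, so no byte is lost.
    static String bytesToJsonText(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Encode with ISO-8859-1: characters 0-255 map back to single bytes.
    static byte[] jsonTextToBytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Same bytes as in the issue description.
        byte[] original = {0, 31, 65, 66, 67, (byte) 255, (byte) 182};
        String text = bytesToJsonText(original);
        byte[] restored = jsonTextToBytes(text);

        // One character per byte, and the round trip restores the input.
        System.out.println(text.length() == original.length);
        System.out.println(Arrays.equals(original, restored));
    }
}
```

The resulting string still contains control characters such as 0x00 and 0x1f, so it is only valid inside JSON once the writer escapes them.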

However, the JSON currently produced by toString is broken because it doesn't convert control
characters to escape sequences (0x0a to \n). We could safely fix that problem without moving
to base64 and I think at a minimum we should do that.
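
As a rough sketch of what that minimal fix could look like, the helper below escapes quotes, backslashes, and control characters below 0x20 when writing a JSON string. The class and method names (JsonEscapeDemo, escapeJsonString) are hypothetical; this is not the code toString uses, and a real fix would more likely delegate to the JSON generator.

```java
// Hypothetical illustration class; not part of Avro.
final class JsonEscapeDemo {

    // Returns the input as a quoted JSON string with the characters
    // JSON forbids in raw form replaced by escape sequences.
    static String escapeJsonString(String s) {
        StringBuilder out = new StringBuilder(s.length() + 2);
        out.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                case '\b': out.append("\\b");  break;
                case '\f': out.append("\\f");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the generic
                        // four-hex-digit escape form.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        out.append('"');
        return out.toString();
    }

    public static void main(String[] args) {
        // A newline (0x0a) and a 0x1f control character in the input.
        System.out.println(escapeJsonString("a\nb" + (char) 0x1f));
    }
}
```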

But this still leaves a problem: what do we do about toString not conforming to the JSON required
by the Avro spec?

> Json output doesn't generate base64 for byte arrays
> ---------------------------------------------------
>                 Key: AVRO-1584
>                 URL: https://issues.apache.org/jira/browse/AVRO-1584
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.7
>         Environment: Pure java.
>            Reporter: Christophe Lorenz
>         Attachments: AVRO-1584-Jackson-Base64-Default-Variant.patch, AVRO-1584.patch
> The Json output of java generated code doesn't correctly encode byte arrays.
> Using this simple schema : 
> {"namespace": "example.avro",
>  "type": "record",
>  "name": "ByteArrayEncoding",
>  "fields": [     {"name": "data", "type": "bytes"} ]
> }
> The toString()  
> 	System.out.println(new ByteArrayEncoding(ByteBuffer.wrap(new byte[]{0,31,65,66,67,(byte)255,(byte)182})));
> Returns raw bytes to string in the json :
> {"data": {"bytes": "  ABC??"}}
> As a byte array is not tied to be a valid string, it should be converted back and forth
> to Base64 like other Json implementations : 
> {"data": {"bytes": "AB9BQkP/tg=="}}
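
For reference, the Base64 form requested in the issue can be reproduced with the sketch below. It assumes java.util.Base64, which requires Java 8 and is not necessarily what the attached patches use; the class name Base64BytesDemo is hypothetical.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical illustration class; not part of Avro.
final class Base64BytesDemo {

    // Standard (RFC 4648) Base64 with padding.
    static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Same bytes as in the issue description.
        byte[] data = {0, 31, 65, 66, 67, (byte) 255, (byte) 182};
        String encoded = encode(data);
        System.out.println(encoded); // prints AB9BQkP/tg==
        System.out.println(Arrays.equals(data, decode(encoded)));
    }
}
```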

This message was sent by Atlassian JIRA
