avro-dev mailing list archives

From "Jamie Olson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-1456) AvroAsTextInputFormat is inconsistent with the Avro JSON Encoding described in the Avro Specification
Date Wed, 12 Feb 2014 19:13:20 GMT

    [ https://issues.apache.org/jira/browse/AVRO-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13899444#comment-13899444 ]

Jamie Olson commented on AVRO-1456:
-----------------------------------

Either way works for me.  The current JSON output is more concise, at the expense of some
ambiguity.  I primarily wanted to verify that this is the intended behavior and not a bug.
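
To illustrate the ambiguity, here is a minimal stdlib-only sketch. It does not use the Avro library; the class and helper names are made up for illustration, and "Bar" is a hypothetical second record type with the same field as Foo. In a union such as ["Foo","Bar"], the bare record bodies come out identical, while the wrapped form keeps them apart:

```java
/** Illustration only: plain string building, no Avro classes involved. */
class BareUnionAmbiguity {

    /** Hypothetical helper: renders a one-field record body, without any record name. */
    static String recordBody(String field, int value) {
        return "{\"" + field + "\": " + value + "}";
    }

    /** Spec-style union wrapper: {"<type name>": <value>}. */
    static String wrap(String typeName, String valueJson) {
        return "{\"" + typeName + "\": " + valueJson + "}";
    }

    public static void main(String[] args) {
        String foo = recordBody("id", 1); // stands in for a Foo instance
        String bar = recordBody("id", 1); // stands in for a Bar instance (assumed type)

        // Bare output: the two branches cannot be told apart.
        System.out.println("bare equal:    " + foo.equals(bar));

        // Wrapped output: the type name disambiguates the branch.
        System.out.println("wrapped equal: " + wrap("Foo", foo).equals(wrap("Bar", bar)));
    }
}
```

A consumer of the bare form would need the schema plus some guessing rule to recover the branch, which is the trade-off against the more concise output.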

If things stay the way they are, it would be helpful to note the difference in the
documentation for AvroAsTextInputFormat.  Not knowing better, I had assumed it used the same
encoding as Avro Tools.

> AvroAsTextInputFormat is inconsistent with the Avro JSON Encoding described in the Avro Specification
> -----------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-1456
>                 URL: https://issues.apache.org/jira/browse/AVRO-1456
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.6
>            Reporter: Jamie Olson
>
> org.apache.avro.mapred.AvroAsTextInputFormat relies on the toString() method rather than
> using org.apache.avro.generic.GenericDatumWriter.write() and org.apache.avro.io.JsonEncoder,
> as org.apache.avro.tool.DataFileReadTool does.  As a result, the data is serialized
> without the fully qualified type name specified in the JSON Encoding section of the
> Avro Specification: http://avro.apache.org/docs/1.7.6/spec.html#json_encoding
> The specification indicates that for a union type ["null","string","Foo"], data should
> be serialized with:
> * null as null;
> * the string "a" as {"string": "a"}; and
> * a Foo instance as {"Foo": {...}}, where {...} indicates the JSON encoding of a Foo instance.
> Instead, AvroAsTextInputFormat serializes these values as:
> * null as null;
> * the string "a" as "a"; and
> * a Foo instance as {...}, where {...} indicates the JSON encoding of a Foo instance.
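
The two behaviors listed above can be sketched side by side. This is a stdlib-only illustration under stated assumptions, not the Avro implementation: it builds the strings by hand, the class and method names (UnionJsonSketch, specEncode, bareEncode) are invented for this example, and "{...}" is kept as the same placeholder used in the issue text.

```java
/** Illustration only: mimics the two output shapes for a ["null","string","Foo"] union. */
class UnionJsonSketch {

    /**
     * Spec-style shape: null stays a bare null; any other value is wrapped
     * as {"<type name>": <value>}. valueJson is already-rendered JSON text.
     */
    static String specEncode(String typeName, String valueJson) {
        if (valueJson == null) {
            return "null";
        }
        return "{\"" + typeName + "\": " + valueJson + "}";
    }

    /** Bare shape, as reported in this issue: the value without its type name. */
    static String bareEncode(String valueJson) {
        return valueJson == null ? "null" : valueJson;
    }

    public static void main(String[] args) {
        String fooBody = "{...}"; // placeholder for the JSON encoding of a Foo instance

        System.out.println(specEncode("null", null) + "  vs  " + bareEncode(null));
        System.out.println(specEncode("string", "\"a\"") + "  vs  " + bareEncode("\"a\""));
        System.out.println(specEncode("Foo", fooBody) + "  vs  " + bareEncode(fooBody));
    }
}
```

Only the null branch looks the same in both shapes; the string and Foo branches lose their type name in the bare form.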



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
