avro-dev mailing list archives

From "David Kay (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AVRO-1850) Align JSON and binary record serialization
Date Sat, 21 May 2016 19:57:12 GMT

     [ https://issues.apache.org/jira/browse/AVRO-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Kay updated AVRO-1850:
----------------------------
    Description: 
The documentation describes the encoding of Avro records as:

bq. Binary: A record is encoded by encoding the values of its fields in the order that they
are declared. In other words, a record is encoded as just the concatenation of the encodings
of its fields. Field values are encoded per their schema.
bq. JSON: Except for unions, the JSON encoding is the same as is used to encode field default
values.
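
As an illustration of the binary rule quoted above, here is a minimal Python sketch, not the Avro library's implementation. It covers only the types that appear in the User example in this description (string, int, null, and unions of them); the helper names ({{zigzag_varint}}, {{encode_value}}, {{encode_record}}) are hypothetical. It relies on the Avro specification's rules that ints are written as zig-zag variable-length integers, strings as a length followed by UTF-8 bytes, null as zero bytes, and a union as the branch index followed by the value.

```python
def zigzag_varint(n: int) -> bytes:
    """Encode an int/long as zig-zag, then as a variable-length base-128 integer."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small negative numbers become small positives
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def matches(schema, value) -> bool:
    """Tell whether a Python value fits one of the three primitive types handled here."""
    if schema == "null":
        return value is None
    if schema == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    if schema == "string":
        return isinstance(value, str)
    return False


def encode_value(schema, value) -> bytes:
    """Encode one field value; a list schema is treated as a union."""
    if isinstance(schema, list):
        for index, branch in enumerate(schema):
            if matches(branch, value):
                # union: index of the matching branch, then the value itself
                return zigzag_varint(index) + encode_value(branch, value)
        raise TypeError(f"{value!r} matches no branch of {schema!r}")
    if schema == "null":
        return b""  # null occupies zero bytes
    if schema == "int":
        return zigzag_varint(value)
    if schema == "string":
        data = value.encode("utf-8")
        return zigzag_varint(len(data)) + data
    raise TypeError(f"unsupported schema in this sketch: {schema!r}")


def encode_record(fields, record) -> bytes:
    """A record is just the concatenation of its field encodings, in declaration order."""
    return b"".join(encode_value(ftype, record[name]) for name, ftype in fields)


# Field list mirroring the User example (name, type), in declaration order.
USER_FIELDS = [
    ("name", "string"),
    ("favorite_number", ["int", "null"]),
    ("favorite_color", ["string", "null"]),
]

user = {"name": "Joe", "favorite_number": 42, "favorite_color": None}
print(encode_record(USER_FIELDS, user).hex())
```

No field names appear in the output; only the declaration order ties each value to its field, which is the property the enhancement below wants to carry over to JSON.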

The _field default values_ table says that records and maps are both encoded as JSON type
_object_.

*Enhancement:*
There is currently no way to write an Avro schema describing a JSON array of positional parameters
(i.e. an array whose elements may each have a different type).  An Avro record is the datatype
representing an ordered collection of values.  For consistency with the binary encoding, and
to allow Avro to represent a schema for JSON tuples, the description of the JSON encoding should say:
bq. JSON: Except for unions and records, the JSON encoding is the same as is used to encode
field default values.  A record is encoded as an array by encoding the values of its fields
in the order that they are declared.

For the example schema:
{noformat}
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number",  "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
{noformat}
the JSON encoding currently represents an Avro record the same way as an Avro map, as a JSON object:
{noformat}
{           "name": "Joe",
 "favorite_number": 42,
  "favorite_color": null  }
{noformat}
Instead, Avro records should be encoded in JSON in the same manner as they are encoded in binary:
as a JSON array containing the field values in the order the fields are declared:
{noformat}
["Joe", 42, null]
{noformat}
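
The two JSON forms above can be compared with a short Python sketch. It is only an illustration of the proposal, not Avro's JSON encoder: the function names are hypothetical, and, like the example above, it writes union values bare, whereas the Avro specification's JSON encoding wraps a non-null union value in an object keyed by its type name (for instance {{{"int": 42}}}).

```python
import json

# Field names of the User example, in declaration order.
USER_FIELDS = ["name", "favorite_number", "favorite_color"]


def record_to_json_object(fields, record) -> str:
    """Current style: the record becomes a JSON object keyed by field name."""
    return json.dumps({name: record[name] for name in fields})


def record_to_json_array(fields, record) -> str:
    """Proposed style: the record becomes a JSON array of values in declaration order."""
    return json.dumps([record[name] for name in fields])


def json_array_to_record(fields, text) -> dict:
    """Decode the proposed style; the schema's field order supplies the names."""
    values = json.loads(text)
    if not isinstance(values, list) or len(values) != len(fields):
        raise ValueError("expected a JSON array with one element per field")
    return dict(zip(fields, values))


user = {"name": "Joe", "favorite_number": 42, "favorite_color": None}

as_object = record_to_json_object(USER_FIELDS, user)
as_array = record_to_json_array(USER_FIELDS, user)
print(as_object)
print(as_array)

# Decoding the array with the same field list recovers the original record.
assert json_array_to_record(USER_FIELDS, as_array) == user
```

As with the binary encoding, the array form can only be decoded by a reader that knows the schema's field order, while the object form carries the field names in the data.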
With this change, the set of JSON texts accepted by the example Avro schema should equal the set
accepted by the corresponding JSON Schema:
{noformat}
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "array",
  "name": "User",
  "items": [
    {"name":"name", "type": "string"},
    {"name":"favorite_number", "type":["integer","null"]},
    {"name":"favorite_color", "type":["string","null"]}
  ]
}
{noformat}
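
To make the intended equivalence concrete, the following Python sketch checks a JSON text against the positional schema above. It is a hand-written checker for this one schema shape, not a JSON Schema validator; the names {{validate_tuple}} and {{exact_length}} are hypothetical. One assumption is worth stating: in JSON Schema draft-04, the tuple form of {{items}} does not by itself constrain the array length, so the sketch adds a length check by default to match a record, which always has exactly one value per field.

```python
import json

# The positional schema from this description, as a Python dict.
USER_TUPLE_SCHEMA = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "array",
    "name": "User",
    "items": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": ["integer", "null"]},
        {"name": "favorite_color", "type": ["string", "null"]},
    ],
}

# Only the primitive types used by the schema above are handled.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "null": lambda v: v is None,
}


def matches_type(type_spec, value) -> bool:
    """A type may be a single name or a list of alternative names."""
    names = type_spec if isinstance(type_spec, list) else [type_spec]
    return any(TYPE_CHECKS[name](value) for name in names)


def validate_tuple(schema, text, exact_length=True) -> bool:
    """Check each array element against the schema entry at the same position."""
    value = json.loads(text)
    if schema["type"] != "array" or not isinstance(value, list):
        return False
    items = schema["items"]
    if exact_length and len(value) != len(items):
        return False  # assumption: a record has exactly one value per field
    return all(matches_type(item["type"], v) for item, v in zip(items, value))


print(validate_tuple(USER_TUPLE_SCHEMA, '["Joe", 42, null]'))
print(validate_tuple(USER_TUPLE_SCHEMA, '["Joe", "42", null]'))
```

Passing {{exact_length=False}} accepts shorter or longer arrays as well, which is closer to what the schema text literally allows; keywords such as {{minItems}}, {{maxItems}}, or {{additionalItems}} would be needed in the schema itself to pin the length.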

  was:
The documentation describes the encoding of Avro records as:

bq.Binary: A record is encoded by encoding the values of its fields in the order that they
are declared. In other words, a record is encoded as just the concatenation of the encodings
of its fields. Field values are encoded per their schema.
bq.JSON: Except for unions, the JSON encoding is the same as is used to encode field default
values.

The _field default values_ table says that records and maps are both encoded as JSON type
_object_.

*Enhancement:*
There is currently no way to write an Avro schema describing a JSON array of positional parameters
(i.e. an array containing variables of possibly different type).  An Avro record is the datatype
representing an ordered collection of values.  For consistency with the binary encoding, and
to allow Avro to represent a schema for JSON tuples, encoding should say:
bq.JSON: Except for unions and records, the JSON encoding is the same as is used to encode
field default values.  A record is encoded as an array by encoding the values of its fields
in the order that they are declared.

For the example schema:
{noformat}
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number",  "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
{noformat}
the JSON encoding currently converts an Avro record to an Avro map (JSON object):
{noformat}
{           "name": "Joe",
 "favorite_number": 42,
  "favorite_color": null  }
{noformat}
Instead Avro records should be encoded in JSON in the same manner as they are encoded in binary,
as a JSON array containing the fields in the order they are defined:
{noformat}
["Joe", 42, null]
{noformat}


> Align JSON and binary record serialization
> ------------------------------------------
>
>                 Key: AVRO-1850
>                 URL: https://issues.apache.org/jira/browse/AVRO-1850
>             Project: Avro
>          Issue Type: Improvement
>            Reporter: David Kay
>              Labels: Encoding, Record
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
