avro-dev mailing list archives

From "Raymie Stata (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-50) bidrectional text representation of AVRO data
Date Tue, 21 Jul 2009 22:33:14 GMT

    [ https://issues.apache.org/jira/browse/AVRO-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733863#action_12733863 ]

Raymie Stata commented on AVRO-50:

I'm not a big fan of introducing new methods into Encoder/Decoder.  Instead of introducing
the new methods readMapKey and writeMapKey, what if we put a new terminal, MAPKEYMARKER, into
the productions for map schemas?  It could then be used to control the behavior of readString,
as follows:

  public Utf8 readString(Utf8 old) throws IOException {
    String result;
    if (parser.topSymbol() == Symbol.MAPKEYMARKER) {
      // The grammar expects a map key, which JSON delivers as a field name.
      if (in.getCurrentToken() == JsonToken.FIELD_NAME) {
        result = in.getText();
      } else {
        throw error("map-key");
      }
    } else if (in.getCurrentToken() == JsonToken.VALUE_STRING) {
      // Ordinary string value.
      result = in.getText();
    } else {
      throw error("string");
    }
    return new Utf8(result);
  }

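To make the idea concrete, here is a rough, self-contained sketch of a marker-driven
readString.  It is not the Avro implementation: the Symbol and TokenType enums, the Token
class, and the two queues are hypothetical stand-ins for the parser's symbol stack and
for Jackson's token stream, and the production "MAPKEYMARKER STRING STRING" for a single
map entry is assumed purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class MapKeyMarkerSketch {
  // Hypothetical stand-ins for Jackson's JsonToken and the parser's grammar symbols.
  public enum TokenType { FIELD_NAME, VALUE_STRING }
  public enum Symbol { MAPKEYMARKER, STRING }

  public static final class Token {
    final TokenType type;
    final String text;
    public Token(TokenType type, String text) {
      this.type = type;
      this.text = text;
    }
  }

  public static final class Decoder {
    private final Deque<Symbol> symbols; // grammar symbols still expected, next one first
    private final Deque<Token> tokens;   // JSON tokens still to be consumed, next one first

    public Decoder(List<Symbol> symbols, List<Token> tokens) {
      this.symbols = new ArrayDeque<>(symbols);
      this.tokens = new ArrayDeque<>(tokens);
    }

    // One method serves both map keys and plain strings; the marker on top of
    // the symbol stack decides which kind of JSON token is acceptable.
    public String readString() {
      Token token = tokens.poll();
      if (token == null) {
        throw new IllegalStateException("unexpected end of input");
      }
      if (symbols.peek() == Symbol.MAPKEYMARKER) {
        symbols.pop(); // consume the marker; the STRING below it is the key
        if (token.type != TokenType.FIELD_NAME) {
          throw new IllegalStateException("expected map-key");
        }
      } else if (token.type != TokenType.VALUE_STRING) {
        throw new IllegalStateException("expected string");
      }
      symbols.pop(); // consume the STRING symbol itself
      return token.text;
    }
  }

  public static void main(String[] args) {
    // Assumed production for one entry of {"k": "v"}: MAPKEYMARKER STRING STRING.
    Decoder decoder = new Decoder(
        Arrays.asList(Symbol.MAPKEYMARKER, Symbol.STRING, Symbol.STRING),
        Arrays.asList(new Token(TokenType.FIELD_NAME, "k"),
                      new Token(TokenType.VALUE_STRING, "v")));
    String key = decoder.readString();
    String value = decoder.readString();
    System.out.println(key + "=" + value); // prints "k=v"
  }
}
```

The point of the sketch is only that callers never need a separate readMapKey: they call
readString for the key and for the value, and the grammar supplies the distinction.
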
> bidrectional text representation of AVRO data
> ---------------------------------------------
>                 Key: AVRO-50
>                 URL: https://issues.apache.org/jira/browse/AVRO-50
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>            Reporter: Philip Zeyliger
>            Assignee: Doug Cutting
>         Attachments: AVRO-50.patch, AVRO-50.patch, AVRO-50.patch, AVRO-50.patch, AVRO-50.patch, AVRO-50.sh, AVRO-50.sh, AVRO-50.sh
> It would be very useful to add a text representation of AVRO data to the spec, and implement toString() and fromString() in all implementations.  Faced with binary data, it'll be a useful operation to decode it for debugging, ad-hoc manipulation, etc.
> I suspect the text format will:
>  * be JSON
>  * require the schema for full interpretation
>  * map easily onto the binary format (if the binary format has a signifier to take a specific branch of a union, the text format will have such a signifier as well)
>  * not be unique (there's more than one way to encode a given number (e.g., {{0x0 == 0}}) or string (e.g., {{"\u0061" == "a"}}), not to mention flexible whitespace)
>  * be compatible, for the binary type, with whatever is decided in AVRO-36

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
