avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-1316) IDL code-generation generates too-long literals for very large schemas
Date Thu, 02 May 2013 02:23:13 GMT

    [ https://issues.apache.org/jira/browse/AVRO-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647205#comment-13647205 ]

Scott Carey commented on AVRO-1316:
-----------------------------------

I have not, but my schemas are only ~12K.

I assume the problem is in the creation of the SCHEMA$ static field?

We could break the string up into 4k chunks.
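
A minimal sketch of the chunking idea, assuming the generator emits each chunk as its own literal and the pieces are joined at runtime. The class and method names (SchemaChunker, chunk, join) are hypothetical and not part of Avro. The join has to happen at runtime because javac folds literals concatenated with + back into a single constant, which would hit the same limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Avro: illustrates splitting a long
// schema JSON string into pieces that each fit in one string literal.
final class SchemaChunker {

    // Split text into consecutive pieces of at most chunkSize chars.
    static List<String> chunk(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < text.length(); i += chunkSize) {
            parts.add(text.substring(i, Math.min(text.length(), i + chunkSize)));
        }
        return parts;
    }

    // Rejoin the pieces at runtime, as a static initializer in the
    // generated class would before handing the JSON to the parser.
    static String join(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }
}
```

With 4k-char chunks, each literal stays well under the class-file limit even if every char needs three bytes when encoded.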

However, it would be more efficient, and produce a significantly smaller class file, if we
used the Schema API programmatically.

This isn't too hard.

We go from the below (edited from one line to many for readability):
{code}
  public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse(
    "{\"type\":\"record\",\"name\":\"HandshakeRequest\",\"namespace\":\"org.apache.avro.ipc\",\"fields\":["
    + "{\"name\":\"clientHash\",\"type\":{\"type\":\"fixed\",\"name\":\"MD5\",\"size\":16}},"
    + "{\"name\":\"clientProtocol\",\"type\":[\"null\",{\"type\":\"string\",\"avro.java.string\":\"String\"}]},"
    + "{\"name\":\"serverHash\",\"type\":\"MD5\"},"
    + "{\"name\":\"meta\",\"type\":[\"null\",{\"type\":\"map\",\"values\":\"bytes\",\"avro.java.string\":\"String\"}]}"
    + "]}");
{code}

to use the new SchemaBuilder:
{code}
  public static final org.apache.avro.Schema SCHEMA$;
  static {
    SCHEMA$ = SchemaBuilder
      .recordType("HandshakeRequest")
      .namespace("org.apache.avro.ipc")
      .requiredFixed("clientHash", MD5.SCHEMA$)
      .unionType("clientProtocol", SchemaBuilder.unionType(
          SchemaBuilder.NULL,
          SchemaBuilder.STRING)
          .build())
          .addProp("avro.java.string", "String")
      .requiredFixed("serverHash", MD5.SCHEMA$)
      .unionType("meta", SchemaBuilder.unionType(
          SchemaBuilder.NULL,
          SchemaBuilder.mapType(SchemaBuilder.BYTES)
            .addProp("avro.java.string", "String")
            .build())
          .build())
      .build();
  }
{code}

                
> IDL code-generation generates too-long literals for very large schemas
> ----------------------------------------------------------------------
>
>                 Key: AVRO-1316
>                 URL: https://issues.apache.org/jira/browse/AVRO-1316
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Jeremy Kahn
>            Priority: Minor
>
> When I work from a very large IDL schema, the Java code generated includes a schema JSON literal that exceeds the length of the maximum allowed literal string ([65535 characters|http://stackoverflow.com/questions/8323082/size-of-initialisation-string-in-java]).
> This creates weird Maven errors like: {{[ERROR] ...FooProtocol.java:[13,89] constant string too long}}.
> It might seem a little crazy, but a 64-kilobyte JSON protocol isn't outrageous at all for some of the more involved data structures, especially if we're including documentation strings etc.
> I believe the fix should be a bit more sensitivity to the length of the JSON literal (and a willingness to split it into more than one literal, joined by {{+}}), but I haven't figured out where that change needs to go. Has anyone else encountered this problem?

