avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-803) Java generated Avro classes make using Avro painful and surprising
Date Mon, 19 Sep 2011 19:22:10 GMT

    [ https://issues.apache.org/jira/browse/AVRO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108077#comment-13108077 ]

Scott Carey commented on AVRO-803:

I like the idea of forcing String for map keys by default.  Utf8 has less value there because
it is mutable, and mutable objects make poor map keys.  Plus, Strings in the JVM are heavily
optimized for use as keys in maps.
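
A minimal sketch of why mutability matters for map keys. MutableBytesKey is a hypothetical
stand-in written for this example, not Avro's Utf8 class; it only assumes a key whose
equals/hashCode depend on a byte[] that can be replaced after insertion.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class MutableKeyDemo {
    /** Hypothetical mutable UTF-8 holder (an assumption for illustration, not Avro's Utf8). */
    static final class MutableBytesKey {
        private byte[] bytes;

        MutableBytesKey(String s) { set(s); }

        /** Replaces the contents, which also changes equals() and hashCode(). */
        void set(String s) { bytes = s.getBytes(StandardCharsets.UTF_8); }

        @Override public boolean equals(Object o) {
            return o instanceof MutableBytesKey
                && Arrays.equals(bytes, ((MutableBytesKey) o).bytes);
        }

        @Override public int hashCode() { return Arrays.hashCode(bytes); }
    }

    /** Inserts a mutable key, mutates it, then looks it up again with the same object. */
    static boolean foundAfterMutation() {
        Map<MutableBytesKey, Integer> map = new HashMap<>();
        MutableBytesKey key = new MutableBytesKey("alpha");
        map.put(key, 1);
        key.set("beta"); // the entry was stored under the hash of "alpha"
        return map.containsKey(key);
    }

    /** The same pattern with an immutable String key. */
    static boolean stringKeyFound() {
        Map<String, Integer> map = new HashMap<>();
        map.put("alpha", 1);
        return map.containsKey("alpha");
    }

    public static void main(String[] args) {
        System.out.println("mutable key found after mutation: " + foundAfterMutation());
        System.out.println("String key found: " + stringKeyFound());
    }
}
```

The entry is still in the map after the mutation, but it can no longer be reached through
the key that was used to insert it. An immutable String key cannot get into that state.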

I believe that users will sometimes want to force their value types to any of the three choices:
String, Utf8, and CharSequence.  We should have reasonable defaults, but make it easier for
users to choose what they want for their use case.  For many use cases, direct conversion
to Strings is better.  For others, it is better to use Utf8 and be as lazy as possible about
the expensive UTF-8 <-> UTF-16 conversion.

Some extra laziness in Utf8 can help in a few places as well, by making the cost of creating
a Utf8 from a String cheaper.
Right now, the Utf8 constructor that takes a String is not lazy: it generates the UTF-8
byte[] immediately.  It could leave the byte[] null and create it only if needed, just
like it lazily creates the String only if needed.
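
A rough sketch of the two-way laziness described above. LazyUtf8Sketch and all of its
members are hypothetical names made up for this example; this is not the actual Utf8
source, and it ignores mutation, thread safety, and buffer reuse.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

/** Hypothetical illustration only; not Avro's Utf8 implementation. */
public final class LazyUtf8Sketch {
    private byte[] bytes;   // UTF-8 form, null until someone asks for it
    private String string;  // UTF-16 form, null until someone asks for it

    /** Built from a String: keep the String and defer the UTF-8 encoding. */
    public LazyUtf8Sketch(String s) {
        this.string = Objects.requireNonNull(s);
    }

    /** Built from UTF-8 bytes: keep a copy of the bytes and defer the decoding. */
    public LazyUtf8Sketch(byte[] utf8) {
        this.bytes = Objects.requireNonNull(utf8).clone();
    }

    /** Encodes on first use, then returns the cached array. */
    public byte[] getBytes() {
        if (bytes == null) {
            bytes = string.getBytes(StandardCharsets.UTF_8);
        }
        return bytes;
    }

    /** Decodes on first use, then returns the cached String. */
    @Override
    public String toString() {
        if (string == null) {
            string = new String(bytes, StandardCharsets.UTF_8);
        }
        return string;
    }

    /** Exposed only so the example can show that the encoding was deferred. */
    boolean bytesMaterialized() {
        return bytes != null;
    }

    public static void main(String[] args) {
        LazyUtf8Sketch u = new LazyUtf8Sketch("h\u00e9llo");
        System.out.println("encoded at construction: " + u.bytesMaterialized());
        System.out.println("UTF-8 length: " + u.getBytes().length);
        System.out.println("encoded after getBytes(): " + u.bytesMaterialized());
    }
}
```

With this shape, code that wraps a String and only ever reads it back as a String never
pays for the encoding at all.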

> Java generated Avro classes make using Avro painful and surprising
> ------------------------------------------------------------------
>                 Key: AVRO-803
>                 URL: https://issues.apache.org/jira/browse/AVRO-803
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.5.0
>         Environment: Any
>            Reporter: Sam Pullara
>             Fix For: 1.6.0
>         Attachments: Foo.java
> Currently the Avro generated Java classes expose CharSequence in their API. However,
> you cannot use any old CharSequence when interacting with them. In fact, you have to use the
> Utf8 class if you want to get consistent results. I think that Avro should work with any CharSequence
> if that is the API. Here is an example where this happens:
> https://github.com/spullara/avro-generated-code/blob/master/src/test/java/AnnoyingTest.java
> That prints out 'false' three times unexpectedly. If you can't get it to print 'true'
> three times then you should probably change it back to Utf8.
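
The kind of surprise the quoted description refers to can be illustrated with JDK types
alone. This sketch does not reproduce the linked AnnoyingTest.java and does not use Avro;
it only assumes the general point that two CharSequence implementations holding the same
characters are usually not equals() to each other. The class and method names are made up
for the example.

```java
public class CharSequenceEqualityDemo {
    /** Compares with equals(), which is implementation-specific for CharSequence. */
    static boolean sameByEquals(CharSequence a, CharSequence b) {
        return a.equals(b);
    }

    /** Compares character content regardless of the implementing class. */
    static boolean sameByContent(CharSequence a, CharSequence b) {
        return a.toString().contentEquals(b);
    }

    public static void main(String[] args) {
        CharSequence asString = "name";
        CharSequence asBuilder = new StringBuilder("name");

        System.out.println("equals():        " + sameByEquals(asString, asBuilder));
        System.out.println("content compare: " + sameByContent(asString, asBuilder));
    }
}
```

An API typed as CharSequence invites callers to pass any implementation, so lookups and
comparisons that rely on equals() behave differently depending on which class is supplied.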
