avro-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-803) Java generated Avro classes make using Avro painful and surprising
Date Thu, 15 Sep 2011 21:29:09 GMT

    [ https://issues.apache.org/jira/browse/AVRO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105697#comment-13105697 ]

Doug Cutting commented on AVRO-803:

What's a concrete proposal?  Should we just switch generated code back to Utf8?

Or we might:
  - Use Utf8 for field values, permitting efficient reuse in loops that read data;
  - Provide getter methods that return String.  A Utf8 memoizes its string conversion, so
    repeated calls to the getter would only allocate a single String.  Applications that wanted
    to avoid that allocation could use the field directly.
  - Provide setter methods that accept CharSequence.  These would check the runtime type and
    convert String to Utf8.
  - For lists and maps, use wrappers to adapt without copying the entire list.

import java.util.AbstractList;
import java.util.List;
import org.apache.avro.util.Utf8;

class Foo {

  public Utf8 x;
  public String getX() {
    return x == null ? null : x.toString();
  }
  public void setX(CharSequence x) {
    // Reuse a Utf8 as-is; convert any other non-null CharSequence.
    this.x = x == null ? null
        : x instanceof Utf8 ? (Utf8)x : new Utf8(x.toString());
  }

  public List<Utf8> values;
  public List<String> getValues() {
    // Read-only view: converts each element on access, without copying the list.
    return new AbstractList<String>() {
      public String get(int i) {
        Utf8 value = values.get(i);
        return value == null ? null : value.toString();
      }
      public int size() { return values.size(); }
    };
  }
  public void setValues(final List<? extends CharSequence> values) {
    // Wraps the caller's list; elements are converted to Utf8 on access.
    this.values = new AbstractList<Utf8>() {
      public Utf8 get(int i) {
        CharSequence value = values.get(i);
        if (value instanceof Utf8)
          return (Utf8)value;
        else if (value == null)
          return null;
        else
          return new Utf8(value.toString());
      }
      public int size() { return values.size(); }
    };
  }
}
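
The list wrappers in the class above rely on AbstractList acting as a live view rather than a copy. A minimal, standard-library-only sketch of that behavior follows. StringViewSketch and asStrings are hypothetical names, and StringBuilder stands in for Utf8 so that the example runs without Avro on the classpath; it is an illustration of the technique, not the generated code.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

public class StringViewSketch {

    /** Returns a read-only List<String> view over the backing list; no elements are copied. */
    public static List<String> asStrings(final List<? extends CharSequence> backing) {
        return new AbstractList<String>() {
            @Override public String get(int i) {
                CharSequence value = backing.get(i);
                return value == null ? null : value.toString();
            }
            @Override public int size() { return backing.size(); }
        };
    }

    public static void main(String[] args) {
        // StringBuilder is only a stand-in for Utf8 (assumption for this sketch).
        List<StringBuilder> raw = new ArrayList<>();
        raw.add(new StringBuilder("a"));

        List<String> view = asStrings(raw);
        raw.add(new StringBuilder("b"));   // added after the view was created

        System.out.println(view);          // prints [a, b]: the view sees the later addition
    }
}
```

Because only get and size are overridden, the view is read-only: calling add or set on it throws UnsupportedOperationException. Each get also converts on demand, so the cost is paid per access rather than once per list.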


> Java generated Avro classes make using Avro painful and surprising
> ------------------------------------------------------------------
>                 Key: AVRO-803
>                 URL: https://issues.apache.org/jira/browse/AVRO-803
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.5.0
>         Environment: Any
>            Reporter: Sam Pullara
>             Fix For: 1.6.0
>
> Currently the Avro generated Java classes expose CharSequence in their API. However,
> you cannot use any old CharSequence when interacting with them. In fact, you have to use the
> Utf8 class if you want to get consistent results. I think that Avro should work with any
> CharSequence if that is the API. Here is an example where this happens:
> https://github.com/spullara/avro-generated-code/blob/master/src/test/java/AnnoyingTest.java
> That prints out 'false' three times unexpectedly. If you can't get it to print 'true'
> three times then you should probably change it back to Utf8.
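
The surprise described in the issue comes from the fact that CharSequence does not define content-based equals or hashCode, so two implementations holding the same characters are generally not equal to each other. The sketch below shows this with standard-library classes only. It is not the linked AnnoyingTest, which is not reproduced here; CharSequenceEquality and sameContent are hypothetical names, and StringBuilder stands in for any non-String CharSequence such as Utf8.

```java
import java.util.HashMap;
import java.util.Map;

public class CharSequenceEquality {

    /** Compares two CharSequences by content, independent of their concrete classes. */
    public static boolean sameContent(CharSequence a, CharSequence b) {
        if (a == null || b == null) {
            return a == b;
        }
        return a.toString().contentEquals(b);
    }

    public static void main(String[] args) {
        CharSequence s = "avro";
        CharSequence sb = new StringBuilder("avro");   // stand-in for a non-String CharSequence

        System.out.println(s.equals(sb));         // false: String.equals requires another String
        System.out.println(sb.equals(s));         // false: StringBuilder uses identity equality
        System.out.println(sameContent(s, sb));   // true: compared character by character

        // The same mismatch affects map lookups keyed by CharSequence.
        Map<CharSequence, Integer> counts = new HashMap<>();
        counts.put(s, 1);
        System.out.println(counts.get(sb));       // null: the key is a different class
    }
}
```

This is why an API typed as CharSequence only behaves consistently when callers know which concrete class is stored behind it.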

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

