avro-dev mailing list archives

From "Vincenz Priesnitz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-1341) Allow controlling avro via java annotations when using reflection.
Date Fri, 26 Jul 2013 17:19:50 GMT

    [ https://issues.apache.org/jira/browse/AVRO-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720966#comment-13720966 ]

Vincenz Priesnitz commented on AVRO-1341:
-----------------------------------------

You are right. The patch made record reading and writing take about twice as long. 
Here is the reflection performance of the trunk: 
{noformat}
                                  test name:     time    M entries/sec   M bytes/sec   bytes/cycle
                         ReflectRecordRead:   5646 ms       2.952       114.543        808498
                        ReflectRecordWrite:   3537 ms       4.711       182.822        808498
                      ReflectBigRecordRead:   6044 ms       1.654       101.558        767380
                     ReflectBigRecordWrite:   4222 ms       2.368       145.384        767380
                          ReflectFloatRead:   5519 ms       0.000       144.932       1000004
                         ReflectFloatWrite:   1210 ms       0.001       660.832       1000004
                         ReflectDoubleRead:   7310 ms       0.000       218.876       2000004
                        ReflectDoubleWrite:   2190 ms       0.000       730.585       2000004
                       ReflectIntArrayRead:   8980 ms       1.856        76.589        859709
                      ReflectIntArrayWrite:   2707 ms       6.156       254.031        859709
                      ReflectLongArrayRead:   4569 ms       1.824       140.991        805344
                     ReflectLongArrayWrite:   1781 ms       4.677       361.609        805344
                    ReflectDoubleArrayRead:   5396 ms       1.853       121.281        818144
                   ReflectDoubleArrayWrite:   1652 ms       6.051       396.060        818144
                     ReflectFloatArrayRead:   9788 ms       2.043        69.156        846172
                    ReflectFloatArrayWrite:   2309 ms       8.661       293.156        846172
               ReflectNestedFloatArrayRead:  11524 ms       1.735        58.738        846172
              ReflectNestedFloatArrayWrite:   4506 ms       4.438       150.199        846172
              ReflectNestedObjectArrayRead:   9895 ms       0.404        52.156        645104
             ReflectNestedObjectArrayWrite:   5745 ms       0.696        89.822        645104
          ReflectNestedLargeFloatArrayRead:   7262 ms       0.459       119.783       1087381
         ReflectNestedLargeFloatArrayWrite:   2006 ms       1.661       433.513       1087381
   ReflectNestedLargeFloatArrayBlockedRead:   7401 ms       0.450       119.034       1101357
  ReflectNestedLargeFloatArrayBlockedWrite:   4797 ms       0.695       183.666       1101357
{noformat}
With the patch applied: 
{noformat}
                                  test name:     time    M entries/sec   M bytes/sec   bytes/cycle
                         ReflectRecordRead:   9332 ms       1.786        69.305        808498
                        ReflectRecordWrite:   7412 ms       2.248        87.252        808498
                      ReflectBigRecordRead:   9533 ms       1.049        64.392        767380
                     ReflectBigRecordWrite:   8132 ms       1.230        75.487        767380
                          ReflectFloatRead:   5432 ms       0.000       147.256       1000004
                         ReflectFloatWrite:   1172 ms       0.001       682.323       1000004
                         ReflectDoubleRead:   6885 ms       0.000       232.387       2000004
                        ReflectDoubleWrite:   2303 ms       0.000       694.613       2000004
                       ReflectIntArrayRead:   8244 ms       2.022        83.426        859709
                      ReflectIntArrayWrite:   2517 ms       6.619       273.148        859709
                      ReflectLongArrayRead:   4534 ms       1.838       142.076        805344
                     ReflectLongArrayWrite:   1729 ms       4.819       372.619        805344
                    ReflectDoubleArrayRead:   4999 ms       2.000       130.928        818144
                   ReflectDoubleArrayWrite:   1431 ms       6.985       457.167        818144
                     ReflectFloatArrayRead:   9139 ms       2.188        74.066        846172
                    ReflectFloatArrayWrite:   2401 ms       8.329       281.898        846172
               ReflectNestedFloatArrayRead:  12295 ms       1.627        55.056        846172
              ReflectNestedFloatArrayWrite:   4975 ms       4.020       136.058        846172
              ReflectNestedObjectArrayRead:  14627 ms       0.273        35.281        645104
             ReflectNestedObjectArrayWrite:  10045 ms       0.398        51.375        645104
          ReflectNestedLargeFloatArrayRead:   7315 ms       0.456       118.910       1087381
         ReflectNestedLargeFloatArrayWrite:   2029 ms       1.642       428.657       1087381
   ReflectNestedLargeFloatArrayBlockedRead:   7429 ms       0.449       118.597       1101357
  ReflectNestedLargeFloatArrayBlockedWrite:   5330 ms       0.625       165.280       1101357
{noformat}
I added the proposed booleans to FieldAccessor, and this brought performance almost back to pre-patch levels:
{noformat}
                                  test name:     time    M entries/sec   M bytes/sec   bytes/cycle
                         ReflectRecordRead:   6391 ms       2.607       101.189        808498
                        ReflectRecordWrite:   4180 ms       3.987       154.712        808498
                      ReflectBigRecordRead:   6276 ms       1.593        97.812        767380
                     ReflectBigRecordWrite:   4926 ms       2.030       124.610        767380
                          ReflectFloatRead:   5580 ms       0.000       143.356       1000004
                         ReflectFloatWrite:   1285 ms       0.001       622.420       1000004
                         ReflectDoubleRead:   6847 ms       0.000       233.657       2000004
                        ReflectDoubleWrite:   2325 ms       0.000       688.114       2000004
                       ReflectIntArrayRead:   7973 ms       2.090        86.252        859709
                      ReflectIntArrayWrite:   2760 ms       6.038       249.168        859709
                      ReflectLongArrayRead:   4720 ms       1.765       136.489        805344
                     ReflectLongArrayWrite:   1762 ms       4.728       365.527        805344
                    ReflectDoubleArrayRead:   5253 ms       1.903       124.587        818144
                   ReflectDoubleArrayWrite:   1637 ms       6.107       399.693        818144
                     ReflectFloatArrayRead:   9280 ms       2.155        72.942        846172
                    ReflectFloatArrayWrite:   2182 ms       9.163       310.143        846172
               ReflectNestedFloatArrayRead:  11072 ms       1.806        61.134        846172
              ReflectNestedFloatArrayWrite:   4058 ms       4.928       166.812        846172
              ReflectNestedObjectArrayRead:  11122 ms       0.360        46.399        645104
             ReflectNestedObjectArrayWrite:   6689 ms       0.598        77.152        645104
          ReflectNestedLargeFloatArrayRead:   7320 ms       0.455       118.834       1087381
         ReflectNestedLargeFloatArrayWrite:   1837 ms       1.814       473.434       1087381
   ReflectNestedLargeFloatArrayBlockedRead:   7383 ms       0.451       119.326       1101357
  ReflectNestedLargeFloatArrayBlockedWrite:   4839 ms       0.689       182.069       1101357
{noformat}
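
To put numbers on "about twice as long" and "almost back to pre-patch", here is a small sketch that divides the elapsed times from the three tables above. The class and method names are only illustrative; the millisecond values are copied from the record rows of the tables. For these four tests the patch comes out at roughly 1.6x to 2.1x the trunk time, and the version with the booleans at roughly 1.04x to 1.18x.

```java
import java.util.Locale;

// Illustrative helper (hypothetical names); times are in ms, copied from the tables above.
final class ReflectSlowdown {

    /** Returns how many times longer a run took than the trunk baseline. */
    static double slowdown(long trunkMs, long otherMs) {
        return (double) otherMs / trunkMs;
    }

    /** Formats one comparison row: trunk vs. patch vs. patch with the booleans. */
    static String row(String test, long trunkMs, long patchMs, long booleansMs) {
        return String.format(Locale.ROOT, "%-24s patch x%.2f, with booleans x%.2f",
                test, slowdown(trunkMs, patchMs), slowdown(trunkMs, booleansMs));
    }

    public static void main(String[] args) {
        System.out.println(row("ReflectRecordRead", 5646, 9332, 6391));
        System.out.println(row("ReflectRecordWrite", 3537, 7412, 4180));
        System.out.println(row("ReflectBigRecordRead", 6044, 9533, 6276));
        System.out.println(row("ReflectBigRecordWrite", 4222, 8132, 4926));
    }
}
```

The primitive and array tests are left out because their times differ only by what looks like run-to-run noise across the three tables.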

Attached is a new patch with the improved performance.
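
For readers who have not opened the patch, a minimal sketch of the general idea, assuming the booleans cache the result of a per-field annotation check so that it is not repeated for every record. This is not the code from the patch: CustomEncoded, CachedFieldAccessor and Event are hypothetical names made up for illustration, and the real FieldAccessor may look quite different.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Hypothetical stand-in annotation; NOT one of the real Avro annotation types.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface CustomEncoded {}

// Hypothetical accessor; assumes the cost being avoided is a repeated annotation lookup.
final class CachedFieldAccessor {
    private final Field field;
    // Computed once per field, so the per-record path only reads a boolean.
    private final boolean customEncoded;

    CachedFieldAccessor(Class<?> owner, String fieldName) {
        try {
            this.field = owner.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
        this.field.setAccessible(true);
        this.customEncoded = field.isAnnotationPresent(CustomEncoded.class);
    }

    /** Cheap check for the hot read/write loop. */
    boolean isCustomEncoded() {
        return customEncoded;
    }

    /** Reads the field value from the given object. */
    Object get(Object target) {
        try {
            return field.get(target);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Small example type with one annotated and one plain field.
final class Event {
    @CustomEncoded long timestamp = 42L;
    int count = 7;
}
```

A reader or writer would build one accessor per field up front and then branch on isCustomEncoded() for each record, instead of asking the Field for its annotations every time.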

                
> Allow controlling avro via java annotations when using reflection. 
> -------------------------------------------------------------------
>
>                 Key: AVRO-1341
>                 URL: https://issues.apache.org/jira/browse/AVRO-1341
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Vincenz Priesnitz
>            Assignee: Vincenz Priesnitz
>             Fix For: 1.7.5
>
>         Attachments: AVRO-1341.patch, AVRO-1341.patch, AVRO-1341.patch, AVRO-1341.patch, AVRO-1341.patch
>
>
> It would be great if one could control Avro with Java annotations. As of now, it is already possible to mark fields as Nullable or to mark classes as being encoded as a String. I propose a bigger set of annotations to control the behavior of Avro on fields and classes. Such annotations have proven useful with Jackson's JSON serialization and Morphia's MongoDB serialization.
> I propose the following additional annotations: 
> @AvroName("alternativeName")
> @AvroAlias(alias="alias", space="space")
> @AvroIgnore
> @AvroMeta(key="K", value="V")
> @AvroEncode(using=CustomEncoding.class)
> Java fields with the @AvroName("alternativeName") annotation will be renamed in the induced schema. When reading an Avro file via reflection, the reflection reader will look for fields in the schema named "alternativeName".
> For example:
> {code}
>    @AvroName("foo")
>    int bar;  
> {code}
> is serialized as
> {code}
>   { "name" : "foo", "type" : "int" } 
> {code}
> The @AvroAlias annotation will add a new alias to the induced schema of a record, enum or field. The space parameter is optional and defaults to the namespace of the named schema the alias is added to.
> Fields with the @AvroIgnore annotation will be treated as if they had a transient modifier, i.e. they will not be written to or read from Avro files.
> The @AvroMeta(key="K", value="V") annotation allows you to store an arbitrary key/value pair at every node in the schema.
> {code}
>    @AvroMeta(key="fieldKey", value="fieldValue")
>    int foo;  
> {code}
> will create the following schema
> {code}
> {"name" : "foo", "type" : "int", "fieldKey" : "fieldValue" } 
> {code}
> Fields can be custom encoded with the @AvroEncode(using=CustomEncoding.class) annotation. This annotation is a generalization of the @Stringable annotation. The @Stringable annotation is limited to classes with string argument constructors. Some classes can be similarly reduced to a smaller class or even a single primitive, but don't fit the requirements for @Stringable. A prominent example is java.util.Date, whose instances can essentially be described with a single long. Such classes can now be encoded with a CustomEncoding, which reads directly from the decoder and writes directly to the encoder.
> One simply extends the abstract CustomEncoding class by implementing a schema, a read method and a write method. A Java field can then be annotated like this:
> {code}
> @AvroEncode(using=DateAsLongEncoding.class)
> Date date;
> {code}
> The custom encoding implementation would look like 
> {code}
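> import java.io.IOException;
> import java.util.Date;
>
> import org.apache.avro.Schema;
> import org.apache.avro.io.Decoder;
> import org.apache.avro.io.Encoder;
> import org.apache.avro.reflect.CustomEncoding;
>
> public class DateAsLongEncoding extends CustomEncoding<Date> {
>   // Instance initializer: a Date is stored as a plain Avro long.
>   {
>     schema = Schema.create(Schema.Type.LONG);
>     schema.addProp("CustomEncoding", "DateAsLongEncoding");
>   }
>
>   // Write the Date as milliseconds since the epoch.
>   @Override
>   public void write(Object datum, Encoder out) throws IOException {
>     out.writeLong(((Date) datum).getTime());
>   }
>
>   // Read the long back, reusing the passed-in Date when there is one.
>   @Override
>   public Date read(Object reuse, Decoder in) throws IOException {
>     if (reuse instanceof Date) {
>       ((Date) reuse).setTime(in.readLong());
>       return (Date) reuse;
>     }
>     return new Date(in.readLong());
>   }
> }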
> {code}
> I implemented said annotations and a custom encoding for java.util.Date as a proof of concept and also extended the @Stringable annotation to fields.
> This issue is a follow-up of AVRO-1328 and AVRO-1330.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
