avro-dev mailing list archives

From "Ryan Blue (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (AVRO-1891) Generated Java code fails with union containing logical type
Date Mon, 05 Sep 2016 16:42:21 GMT

    [ https://issues.apache.org/jira/browse/AVRO-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15463687#comment-15463687 ]

Ryan Blue edited comment on AVRO-1891 at 9/5/16 4:42 PM:
---------------------------------------------------------

I think all this requires is keeping a set of conversions that should be applied when reading
or writing a specific class. Unlike generic, where the conversions are determined by the data
model at runtime, the conversions that should be used for a specific class are determined
at compile time. We have the benefit of knowing that the compiler either added conversions
for all instances of a logical type or for none of them, so we only need to know the set
of conversions the compiler had set up when a class was compiled.
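For concreteness, here is a rough, hypothetical sketch of a generated class carrying its compile-time set of conversions. None of these members exist in the real generated code; the map, the static initializer, and conversionFor are illustrative only, and TimestampConversion is the joda-based conversion from org.apache.avro.data.TimeConversions that the reporter's workaround below also uses:

{code}
import java.util.HashMap;
import java.util.Map;

import org.apache.avro.Conversion;
import org.apache.avro.data.TimeConversions;

public class RecordV1 {  // stands in for a compiler-generated specific record class

  // Hypothetical: a static map baked in at compile time with exactly the
  // conversions the compiler had configured, keyed by logical type name.
  private static final Map<String, Conversion<?>> CONVERSIONS = new HashMap<>();
  static {
    // The compiler either registered a conversion for every use of a logical
    // type or for none of them, so this map is complete for this class.
    CONVERSIONS.put("timestamp-millis", new TimeConversions.TimestampConversion());
  }

  /** Hypothetical lookup a datum reader/writer could call instead of the data model. */
  public static Conversion<?> conversionFor(String logicalTypeName) {
    return CONVERSIONS.get(logicalTypeName);
  }

  public org.joda.time.DateTime first;
}
{code}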

Rather than relying on the set of conversions the SpecificData instance has configured, I
think we should keep the set of conversions for the class being written or read. That way we
don't need to change how SpecificData looks up conversions, just the way the SpecificDatumReader/Writer
looks them up, so that it avoids going through the data model. (I agree with Doug and don't
see an advantage of adding a conversion resolver.)

What about this:

* Maintain a thread-local reference to the current specific record class in SpecificDatumReader
* Add a static conversion map to each specific class with its conversions (generated code)
* Add conversion lookup methods to GenericDatumReader that delegate to GenericData
* Override the conversion lookup methods in SpecificDatumReader to use the current record class's set of conversions instead.

This way, there are no changes to how the data model lookups work, little new generated code
(just the static conversion map), and few changes to the datum readers and writers. What do
you guys think? I think this would be a bit smaller patch. I'll try to put it together tomorrow
if I have time.
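
A minimal sketch of the reader side of that plan, reusing the hypothetical RecordV1.conversionFor from the sketch above; the class, the thread-local, and the method names here are illustrative, not the actual patch:

{code}
import org.apache.avro.Conversion;
import org.apache.avro.LogicalType;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;

public class SketchSpecificDatumReader<D> extends GenericDatumReader<D> {

  // Thread-local reference to the specific record class currently being read.
  private static final ThreadLocal<Class<?>> CURRENT_CLASS = new ThreadLocal<>();

  // Would be set when a specific record instance is created and cleared once
  // its fields have been read.
  static void setCurrentClass(Class<?> recordClass) {
    CURRENT_CLASS.set(recordClass);
  }

  // Hypothetical conversion lookup: ask the class being read rather than the
  // data model, so the result always matches what the compiler generated.
  protected Conversion<?> conversionFor(Schema schema) {
    LogicalType logicalType = schema.getLogicalType();
    if (logicalType == null || CURRENT_CLASS.get() == null) {
      return null; // no logical type, or not inside a specific record
    }
    // RecordV1.conversionFor stands in for the generated static lookup above.
    return RecordV1.conversionFor(logicalType.getName());
  }
}
{code}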


was (Author: rdblue):
I think all this requires is keeping a set of conversions that should be applied when reading
or writing a specific class. Unlike generic where the conversions are determined by the data
model at runtime, the conversions that should be used for a specific class are determined
at compile time. We have the benefit of knowing that the compiler either added conversions
for all instances of a logical type, or for none of them. So we only need to know the set
of conversions the compiler had set up when a class was compiled.

Rather than relying on the set of conversions the SpecificData instance has configured, I
think we should keep the set of conversions for the class being written or read. So we don't
need to change how SpecificData looks up conversions, just the way the SpecificDatumReader/Writer
does to avoid looking them up in the data model. (I agree with Doug and don't see an advantage
of adding a conversion resolver.)

What about this:

* Maintain a thread-local reference to the current specific record class in GenericDatumReader
* Add a static conversion map to each specific class with its conversions (generated code)
* Add conversion lookup methods to GenericDatumReader that delegate to GenericData
* Override the conversion lookup methods in SpecificDatumReader that use the current record
class's set of conversions instead.

This way, there are no changes to how the data model lookups work, little generated code (just
an annotation to conversion map), and few changes to the datum reader and writers. What do
you guys think? I think this would be a bit smaller patch. I'll try to put it together tomorrow
if I have time.

> Generated Java code fails with union containing logical type
> ------------------------------------------------------------
>
>                 Key: AVRO-1891
>                 URL: https://issues.apache.org/jira/browse/AVRO-1891
>             Project: Avro
>          Issue Type: Bug
>          Components: java, logical types
>    Affects Versions: 1.8.1
>            Reporter: Ross Black
>            Priority: Blocker
>             Fix For: 1.8.3
>
>         Attachments: AVRO-1891.patch, AVRO-1891.yshi.1.patch, AVRO-1891.yshi.2.patch, AVRO-1891.yshi.3.patch, AVRO-1891.yshi.4.patch
>
>
> Example schema:
> {code}
>     {
>       "type": "record",
>       "name": "RecordV1",
>       "namespace": "org.brasslock.event",
>       "fields": [
>         { "name": "first", "type": ["null", {"type": "long", "logicalType":"timestamp-millis"}]}
>       ]
>     }
> {code}
> The avro compiler generates a field using the relevant joda class:
> {code}
>     public org.joda.time.DateTime first
> {code}
> Running the following code to perform encoding:
> {code}
>         final RecordV1 record = new RecordV1(DateTime.parse("2016-07-29T10:15:30.00Z"));
>         final DatumWriter<RecordV1> datumWriter = new SpecificDatumWriter<>(record.getSchema());
>         final ByteArrayOutputStream stream = new ByteArrayOutputStream(8192);
>         final BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(stream, null);
>         datumWriter.write(record, encoder);
>         encoder.flush();
>         final byte[] bytes = stream.toByteArray();
> {code}
> fails with the exception stacktrace:
> {code}
>  org.apache.avro.AvroRuntimeException: Unknown datum type org.joda.time.DateTime: 2016-07-29T10:15:30.000Z
>     at org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:741)
>     at org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:293)
>     at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:706)
>     at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192)
>     at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110)
>     at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87)
>     at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
>     at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
>     at org.brasslock.avro.compiler.GeneratedRecordTest.shouldEncodeLogicalTypeInUnion(GeneratedRecordTest.java:82)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117)
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42)
>     at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:253)
>     at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:84)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
> {code}
> The failure can be fixed by explicitly adding the relevant conversion(s) to DatumWriter / SpecificData:
> {code}
>         final RecordV1 record = new RecordV1(DateTime.parse("2007-12-03T10:15:30.00Z"));
>         final SpecificData specificData = new SpecificData();
>         specificData.addLogicalTypeConversion(new TimeConversions.TimestampConversion());
>         final DatumWriter<RecordV1> datumWriter = new SpecificDatumWriter<>(record.getSchema(), specificData);
>         final ByteArrayOutputStream stream = new ByteArrayOutputStream(AvroUtil.DEFAULT_BUFFER_SIZE);
>         final BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(stream, null);
>         datumWriter.write(record, encoder);
>         encoder.flush();
>         final byte[] bytes = stream.toByteArray();
> {code}



