avro-user mailing list archives

From Michael Pigott <mpigott.subscripti...@gmail.com>
Subject Re: How to get Specific classes to expose BigDecimal fields
Date Wed, 12 Nov 2014 03:11:35 GMT
You're welcome!  I'm glad I was able to help.  If you find a better
long-term solution, feel free to offer it to AVRO-1497!  -Mike

On Tue Nov 11 2014 at 3:44:26 AM Fady <fady@legsem.com> wrote:

>
> Thank you Mike for taking the time to reply to this,
>
> I looked at your code and applied the AVRO-457 patch you did. Indeed you
> fixed a very similar problem. In your case XMLSchema delivers and expects
> BigDecimals so you mapped that to ByteBuffer as specified in
> http://avro.apache.org/docs/1.7.7/spec.html#Decimal.
>
> As for the SpecificCompiler, I ended up creating a custom compiler:
>
>       import org.apache.avro.Schema;
>       import org.apache.avro.Schema.Type;
>       import org.apache.avro.compiler.specific.SpecificCompiler;
>       import org.codehaus.jackson.JsonNode;
>
>       /**
>        * Temporary workaround for the lack of support for BigDecimal in the
>        * Avro Specific Compiler.
>        * <p/>
>        * The record.vm template is customized to expose BigDecimal getters
>        * and setters.
>        */
>       public class CustomSpecificCompiler extends SpecificCompiler {
>
>           private static final String TEMPLATES_PATH =
>                   "/com/legstar/avro/generator/specific/templates/java/classic/";
>
>           public CustomSpecificCompiler(Schema schema) {
>               super(schema);
>               setTemplateDir(TEMPLATES_PATH);
>           }
>
>           /**
>            * In the case of BigDecimals there is an internal java type
>            * (ByteBuffer) and an external java type for getters/setters.
>            *
>            * @param schema the field schema
>            * @return the field java type
>            */
>           public String externalJavaType(Schema schema) {
>               return isBigDecimal(schema) ? "java.math.BigDecimal"
>                       : super.javaType(schema);
>           }
>
>           /** Tests whether a field is to be externalized as a BigDecimal. */
>           public static boolean isBigDecimal(Schema schema) {
>               if (Type.BYTES == schema.getType()) {
>                   // Avro 1.7.x exposes schema attributes as Jackson JsonNode properties
>                   JsonNode logicalTypeNode = schema.getJsonProp("logicalType");
>                   if (logicalTypeNode != null
>                           && "decimal".equals(logicalTypeNode.asText())) {
>                       return true;
>                   }
>               }
>               return false;
>           }
>
>       }
>
> And then changed the record.vm velocity template like this:
>
>     72c72
>     <   public ${this.mangle($schema.getName())}(#foreach($field in $schema.getFields())${this.externalJavaType($field.schema())} ${this.mangle($field.name())}#if($velocityCount < $schema.getFields().size()), #end#end) {
>     ---
>     >   public ${this.mangle($schema.getName())}(#foreach($field in $schema.getFields())${this.javaType($field.schema())} ${this.mangle($field.name())}#if($velocityCount < $schema.getFields().size()), #end#end) {
>     74c74
>     <     ${this.generateSetMethod($schema, $field)}(${this.mangle($field.name())});
>     ---
>     >     this.${this.mangle($field.name())} = ${this.mangle($field.name())};
>     110,113c110
>     <   public ${this.externalJavaType($field.schema())} ${this.generateGetMethod($schema, $field)}() {
>     < #if ($this.isBigDecimal($field.schema()))
>     <     return new java.math.BigDecimal(new java.math.BigInteger(${this.mangle($field.name())}.array()), $field.schema().getJsonProp("scale"));
>     < #else
>     ---
>     >   public ${this.javaType($field.schema())} ${this.generateGetMethod($schema, $field)}() {
>     115d111
>     < #end
>     124,127c120
>     <   public void ${this.generateSetMethod($schema, $field)}(${this.externalJavaType($field.schema())} value) {
>     < #if ($this.isBigDecimal($field.schema()))
>     <     this.${this.mangle($field.name(), $schema.isError())} = java.nio.ByteBuffer.wrap(value.unscaledValue().toByteArray());
>     < #else
>     ---
>     >   public void ${this.generateSetMethod($schema, $field)}(${this.javaType($field.schema())} value) {
>     129d121
>     < #end
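
The getter and setter in the template convert between ByteBuffer and BigDecimal inline. Below is a minimal sketch of the same conversion as a standalone helper. The class name DecimalBytes and its two methods are hypothetical and not part of Avro. It differs from the template in two ways that may or may not matter for a given use: it reads only the bytes between the buffer's position and limit rather than the whole backing array returned by array(), and it forces the value to the schema scale before taking the unscaled value, so a BigDecimal with a different scale is rejected if digits would be lost.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;
import java.nio.ByteBuffer;

/** Hypothetical helper for the Avro decimal logical type; not part of Avro. */
final class DecimalBytes {

    private DecimalBytes() {
    }

    /**
     * Encodes a BigDecimal as the big-endian two's-complement bytes of its
     * unscaled value, after forcing it to the scale declared in the schema.
     */
    static ByteBuffer toBytes(BigDecimal value, int scale) {
        // UNNECESSARY makes setScale throw ArithmeticException instead of rounding
        BigDecimal scaled = value.setScale(scale, RoundingMode.UNNECESSARY);
        return ByteBuffer.wrap(scaled.unscaledValue().toByteArray());
    }

    /** Decodes the remaining bytes of the buffer using the schema scale. */
    static BigDecimal fromBytes(ByteBuffer buffer, int scale) {
        // Work on a duplicate so the caller's position and limit are untouched
        ByteBuffer view = buffer.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return new BigDecimal(new BigInteger(bytes), scale);
    }
}
```

With a helper like this, the template branches could call DecimalBytes.fromBytes(field, scale) and DecimalBytes.toBytes(value, scale) instead of building the conversion in Velocity. An empty buffer is not handled here; new BigInteger(byte[]) throws NumberFormatException for a zero-length array.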
>
> This fixes the issue for me but is not a good long-term solution. In
> particular, the builder part of the generated Specific class still
> exposes ByteBuffer instead of BigDecimal, which is inconsistent.
>
> More generally, it seems to me that a better solution would be to extend
> the "java-class" trick so that more complex conversions can occur between
> the Avro type and the Java type exposed by Specific classes. Right now, the
> Java type must be castable from the Avro type, which is limiting.
>
> Anyway, thanks again for your great insight.
>
>
> Fady
>
>
>
>
>
> On 11/11/2014 05:06, Michael Pigott wrote:
>
> Hi Fady,
>     Properly handling BigDecimal types in Java is still an open question.
> AVRO-1402 [1] added BigDecimal types to the Avro spec, but the Java support
> is an open ticket under AVRO-1497 [2].  When I added BigDecimal support to
> AVRO-457 (XML <-> Avro support), I added support for the Avro decimal
> logical type using Java BigDecimals.  You can see the conversion code [3]
> as well as the writer [4] and reader [5] code in my GitHub repository, or
> download the patch in AVRO-457 [6] and look for BigDecimal in the
> Utils.java, XmlDatumWriter.java, and XmlDatumReader.java files,
> respectively.
>
>  Good luck!
> Mike
>
> [1] https://issues.apache.org/jira/browse/AVRO-1402
> [2] https://issues.apache.org/jira/browse/AVRO-1497
> [3] https://github.com/mikepigott/xml-to-avro/blob/master/avro-to-xml/src/main/java/org/apache/avro/xml/Utils.java#L537
> [4] https://github.com/mikepigott/xml-to-avro/blob/master/avro-to-xml/src/main/java/org/apache/avro/xml/XmlDatumWriter.java#L1150
> [5] https://github.com/mikepigott/xml-to-avro/blob/master/avro-to-xml/src/main/java/org/apache/avro/xml/XmlDatumReader.java#L998
> [6] https://issues.apache.org/jira/browse/AVRO-457
>
> On Sat Nov 08 2014 at 4:11:32 AM Fady <fady@legsem.com> wrote:
>
>> Hello,
>>
>> I am working on a project that aims at converting Mainframe data to Avro
>> records (https://github.com/legsem/legstar.avro).
>>
>> Mainframe data often contains Decimal types. For these, I would like the
>> corresponding Avro records to expose BigDecimal fields.
>>
>> Initially, I followed the recommendation here:
>> http://avro.apache.org/docs/1.7.7/spec.html#Decimal. My schema contains
>> for instance:
>>
>>      {
>>        "name":"transactionAmount",
>>        "type":{
>>          "type":"bytes",
>>          "logicalType":"decimal",
>>          "precision":7,
>>          "scale":2
>>        }
>>      }
>>
>> This works fine but the Avro Specific record produced by the
>> SpecificCompiler exposes a ByteBuffer for that field.
>>
>>    @Deprecated public java.nio.ByteBuffer transactionAmount;
>>
>> Not what I want.
>>
>> I tried this alternative:
>>
>>      {
>>        "name":"transactionAmount",
>>        "type":{
>>          "type":"string",
>>          "java-class":"java.math.BigDecimal",
>>          "logicalType":"decimal",
>>          "precision":7,
>>          "scale":2
>>        }
>>      }
>>
>> Now the SpecificCompiler produces the result I need:
>>
>>    @Deprecated public java.math.BigDecimal transactionAmount;
>>
>> There are 2 problems though:
>>
>> 1. It is less efficient to serialize/deserialize a BigDecimal as a
>> string than as the two's-complement bytes of its unscaled value.
>>
>> 2. The Specific Record obtained this way cannot be populated using a
>> deep copy from a Generic Record.
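
To give a rough feel for the first point, here is a small sketch that compares the payload size of the two representations. The class name DecimalPayloadSize is hypothetical. It counts only the payload bytes; in Avro's binary encoding both string and bytes values also carry a length prefix, which is left out here.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper comparing text and two's-complement payload sizes. */
final class DecimalPayloadSize {

    private DecimalPayloadSize() {
    }

    /** Number of UTF-8 bytes needed for the plain decimal text. */
    static int stringPayloadBytes(BigDecimal value) {
        return value.toPlainString().getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes in the two's-complement form of the unscaled value. */
    static int bytesPayloadBytes(BigDecimal value) {
        return value.unscaledValue().toByteArray().length;
    }
}
```

For a precision 7, scale 2 value such as 12345.67, the text form takes 8 bytes while the unscaled value 1234567 fits in 3. The text form also has to be parsed back into a BigDecimal on every read.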
>>
>> To clarify the second point:
>>
>> When I convert the mainframe data I do something like:
>>
>>          GenericRecord genericRecord = new GenericData.Record(schema);
>>          ... populate genericRecord from Mainframe data ...
>>          return (D) SpecificData.get().deepCopy(schema, genericRecord);
>>
>> This fails with:
>>          java.lang.ClassCastException: java.lang.String cannot be cast to java.math.BigDecimal
>>              at legstar.avro.test.specific.cusdat.Transaction.put(Transaction.java:47)
>>              at org.apache.avro.generic.GenericData.setField(GenericData.java:573)
>>              at org.apache.avro.generic.GenericData.setField(GenericData.java:590)
>>              at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:972)
>>              at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:926)
>>              at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:970)
>>              at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:970)
>>
>>
>> This is because the code in the Specific record assumes the value
>> received is already a BigDecimal:
>>
>>      case 1: transactionAmount = (java.math.BigDecimal)value$; break;
>>
>> In other words, the java-class trick produces the right interface for
>> Specific classes but the internal data types are not consistent with the
>> GenericRecord derived from the same schema.
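
The mismatch can be reproduced without Avro. The sketch below uses a hypothetical stand-in class, TransactionSketch, which is not generated Avro code. One method mirrors the blind cast in the generated put(), the other shows one possible tolerant variant that converts character data instead of casting it. Whether patching the template this way is appropriate is an open question; it is shown only to illustrate the failure.

```java
import java.math.BigDecimal;

/** Hypothetical stand-in for a generated Specific record; not Avro output. */
final class TransactionSketch {

    private BigDecimal transactionAmount;

    /** Mirrors the generated put(): a blind cast that fails for a String. */
    void putWithCast(Object value) {
        transactionAmount = (BigDecimal) value;
    }

    /** Tolerant variant: converts character data rather than casting it. */
    void putWithConversion(Object value) {
        if (value == null) {
            transactionAmount = null;
        } else if (value instanceof BigDecimal) {
            transactionAmount = (BigDecimal) value;
        } else if (value instanceof CharSequence) {
            // The generic record holds the decimal as text for a "string" schema
            transactionAmount = new BigDecimal(value.toString());
        } else {
            throw new IllegalArgumentException(
                    "Unsupported value type: " + value.getClass().getName());
        }
    }

    BigDecimal getTransactionAmount() {
        return transactionAmount;
    }
}
```

Calling putWithCast("123.45") throws the same ClassCastException as in the stack trace, while putWithConversion("123.45") stores a BigDecimal.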
>>
>> So my question is: what would be a better approach for Specific classes
>> to expose BigDecimal fields?
>>
>>
>
