avro-dev mailing list archives

From "liviu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-2088) Decimal logicalType values serialized in hexidecimal vs decimal
Date Fri, 06 Oct 2017 12:57:00 GMT

    [ https://issues.apache.org/jira/browse/AVRO-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194550#comment-16194550 ]

liviu commented on AVRO-2088:

Hi [~zi],

But when I read the Avro files through Hive (using a Hive external table), the corresponding
datatype in Hive is "decimal" in both cases, and the data is displayed correctly in Hive.
So, with the same Avro schema, Hive is able to deserialize both values ("\u00018" and "3.12")
to "3.12".
If this is a DataStage bug, is the Hive server then also not compliant?

> Decimal logicalType values serialized in hexidecimal vs decimal
> ---------------------------------------------------------------
>                 Key: AVRO-2088
>                 URL: https://issues.apache.org/jira/browse/AVRO-2088
>             Project: Avro
>          Issue Type: Task
>            Reporter: liviu
> We use this schema for AVRO file:
> {code:java}
> "name":"col1",
> "type":["null",
> 	{
> 		"type":"bytes",
> 		"logicalType":"decimal",
> 		"precision":19,
> 		"scale":2
> 	}
> 	]
> {code}
> - if we save data in Avro using Sqoop or Hive (external table), the values are saved
> in hexadecimal format (e.g. for 3.12 the saved value is: {color:#d04437}*{"col1":{"bytes":"\u00018"}}*{color})
> - if we save the data in that Avro file using DataStage, the values are saved in decimal
> format (e.g. for 3.12 the saved value is: {color:#d04437}*{"col1":{"bytes":"3.12"}}*{color})
> The questions are:
> 1). Why is there this difference, with the data serialized in hexadecimal in one case
> and in decimal in the other?
> 2). Are these differences caused by the Avro serialization encoding used (binary encoding
> in one case, JSON encoding in the other)?
> 3). How can we control how the values are serialized (e.g. we want to have them as "3.12"
> instead of "\u00018")?
> Thanks,
> Liviu
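
For reference, the two stored values in the description can be compared with a small sketch. It assumes the encoding described in the Avro specification for the decimal logical type (the bytes hold the two's-complement, big-endian unscaled integer) and the Avro JSON encoding of bytes (one character per byte). The helper names encode_decimal, decode_decimal and to_avro_json_bytes are hypothetical, written only for this illustration; they are not part of any Avro library, and this is not a statement of how Sqoop, Hive or DataStage are implemented.

```python
import json
from decimal import Decimal


def encode_decimal(value: Decimal, scale: int) -> bytes:
    """Encode a decimal as big-endian two's-complement bytes of its unscaled value."""
    shifted = value.scaleb(scale)
    if shifted != shifted.to_integral_value():
        raise ValueError("value has more fractional digits than the scale allows")
    unscaled = int(shifted)
    # Minimal byte length that still leaves room for the sign bit.
    magnitude = unscaled if unscaled >= 0 else ~unscaled
    length = magnitude.bit_length() // 8 + 1
    return unscaled.to_bytes(length, byteorder="big", signed=True)


def decode_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode big-endian two's-complement bytes back into a decimal."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


def to_avro_json_bytes(raw: bytes) -> str:
    """Render bytes as a JSON string with one character per byte (code points 0-255)."""
    return json.dumps(raw.decode("latin-1"))


if __name__ == "__main__":
    raw = encode_decimal(Decimal("3.12"), scale=2)
    print(raw.hex())                     # unscaled value 312 is 0x0138
    print(to_avro_json_bytes(raw))       # byte 0x01 is escaped, byte 0x38 is the character '8'
    print(decode_decimal(raw, scale=2))  # round-trips to 3.12
```

Under these assumptions, 3.12 with scale 2 is the unscaled integer 312, i.e. the two bytes 0x01 0x38, and the JSON rendering of those bytes is "\u00018" because 0x38 is the ASCII character '8'. The value "3.12", by contrast, is the four ASCII bytes of the text itself rather than an unscaled integer.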

This message was sent by Atlassian JIRA
