hive-dev mailing list archives

From Yibing Shi <shi.yib...@gmail.com>
Subject Re: Review Request 49952: HIVE-14205: Hive doesn't support union type with AVRO file format
Date Wed, 13 Jul 2016 10:49:48 GMT


> On July 13, 2016, 11:11 a.m., Chaoyu Tang wrote:
> > Thanks [~yshi] for the patch. It looks good, but I have a couple of questions:
> > It seems to me that the union in the existing code is only used to support nullable
> > types in Avro, and has not been fully supported as a data type in general. This patch
> > actually extends (or adds) this type support.
> > So with the patch, how can we distinguish a nullable Avro union from a non-nullable
> > one? For example, for the following field schemas, both might end up with type
> > uniontype<int,bigint>:
> > {code}
> >       "fields":[
> >            {
> >              "name":"value",
> >              "type":[
> >                  "null",
> >                  "int",
> >                  "long"
> >              ],
> >              "default":null
> >            }
> >       ]
> > ---
> >       "fields":[
> >            {
> >              "name":"value",
> >              "type":[
> >                  "int",
> >                  "long"
> >              ],
> >              "default": 0
> >            }
> >       ]
> > {code}
> > Will there be any problem? Also could we add some qtests using Avro union data (with
> > or without null)?

Hi [~ctang], thanks for the review!
Your concern that both nullable and non-nullable Avro unions may end up as the same union
type in Hive is very sound. My understanding is that every column in Hive is nullable (there
isn't any keyword like "not null" or "primary key" in Hive). As a result, schema ["null",
"int", "long"] should always be used in preference to ["int", "long"]. The latter is supported
by Hive just for better compatibility. So, it should be OK to map both ["null", "int", "long"]
and ["int", "long"] to "uniontype<int,bigint>".
Please let me know your opinions.
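
To make the proposed mapping concrete, here is a minimal sketch of the idea. It is not the
code from the patch: the class UnionMappingSketch, the method toHiveType and the small
type table are hypothetical names used only for illustration, and Avro union branches are
modeled as plain strings instead of Avro Schema objects. The sketch assumes that the "null"
branch is simply dropped because nullability is implicit in Hive, and that Avro "long" is
spelled "bigint" on the Hive side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: NOT the code from the HIVE-14205 patch.
final class UnionMappingSketch {

    // Avro primitive name -> Hive type name, limited to the types used in this thread.
    private static final Map<String, String> AVRO_TO_HIVE = Map.of(
            "int", "int",
            "long", "bigint",
            "string", "string");

    /** Maps the branches of an Avro union to a Hive type name, dropping "null". */
    static String toHiveType(List<String> avroBranches) {
        List<String> hiveTypes = new ArrayList<>();
        for (String branch : avroBranches) {
            if (branch.equals("null")) {
                continue; // assumption: every Hive column is nullable, so "null" adds nothing
            }
            String hiveType = AVRO_TO_HIVE.get(branch);
            if (hiveType == null) {
                throw new IllegalArgumentException("Unsupported branch in this sketch: " + branch);
            }
            hiveTypes.add(hiveType);
        }
        if (hiveTypes.isEmpty()) {
            throw new IllegalArgumentException("Union has no non-null branch");
        }
        if (hiveTypes.size() == 1) {
            return hiveTypes.get(0); // e.g. ["null", "int"] is just a nullable int
        }
        return "uniontype<" + String.join(",", hiveTypes) + ">";
    }

    public static void main(String[] args) {
        // Both schemas from the review comment produce the same Hive type string.
        System.out.println(toHiveType(List.of("null", "int", "long")));
        System.out.println(toHiveType(List.of("int", "long")));
    }
}
```

With this mapping the nullable and non-nullable schemas are indistinguishable on the Hive
side, which is exactly the trade-off being discussed.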

I will try to add qtests as you suggested.


- Yibing


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49952/#review141997
-----------------------------------------------------------


On July 12, 2016, 9:07 p.m., Yibing Shi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49952/
> -----------------------------------------------------------
> 
> (Updated July 12, 2016, 9:07 p.m.)
> 
> 
> Review request for hive and Chaoyu Tang.
> 
> 
> Bugs: HIVE-14205
>     https://issues.apache.org/jira/browse/HIVE-14205
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-14205: Hive doesn't support union type with AVRO file format
> 
> 
> Diffs
> -----
> 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 6165138 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 08ee62b 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 986b803 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 0013b78 
> 
> Diff: https://reviews.apache.org/r/49952/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Yibing Shi
> 
>

