hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mohammad Islam" <misla...@yahoo.com>
Subject Re: Review Request 11925: Hive-3159 Update AvroSerde to determine schema of new tables
Date Tue, 23 Jul 2013 01:56:06 GMT


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java, line 67
> > <https://reviews.apache.org/r/11925/diff/3/?file=320914#file320914line67>
> >
> >     Should be a debug

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java, line 77
> > <https://reviews.apache.org/r/11925/diff/3/?file=320914#file320914line77>
> >
> >     spacing

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java, line 125
> > <https://reviews.apache.org/r/11925/diff/3/?file=320914#file320914line125>
> >
> >     Use j here instead, so as not to shadow the i.

Not clear. What is it shadowing? There is no other 'i'.


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 36
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line36>
> >
> >     Some spacing on the table would help.

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 77
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line77>
> >
> >     formatting

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 93
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line93>
> >
> >     fot -> for.  type info -> TypeInfo

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 98
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line98>
> >
> >     Is there any way to turn off the union wrapping?  If not, does it require its
own separate parameter?

'wrapUnion' flag is needed to avoid the duplicated "Union". Union schema does not required
to wrap with union. I think it is an exception, if it does.


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 100
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line100>
> >
> >     formatting

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 109
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line109>
> >
> >     Move this above the first function so it's clear to readers.

Done.


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 111
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line111>
> >
> >     Yeah, do that.

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 125
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line125>
> >
> >     Doesn't this duplicate generateSchema above?

Removed


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 211
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line211>
> >
> >     can this be private?

It is used by Test case.


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java, line
56
> > <https://reviews.apache.org/r/11925/diff/3/?file=320917#file320917line56>
> >
> >     delete

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java, line
121
> > <https://reviews.apache.org/r/11925/diff/3/?file=320917#file320917line121>
> >
> >     delete

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java, line
127
> > <https://reviews.apache.org/r/11925/diff/3/?file=320917#file320917line127>
> >
> >     odd name...

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java, line
26
> > <https://reviews.apache.org/r/11925/diff/3/?file=320917#file320917line26>
> >
> >     Either add msg string or break out into separate tests.

Done. Added msg.


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java, line
244
> > <https://reviews.apache.org/r/11925/diff/3/?file=320916#file320916line244>
> >
> >     assert on the content of the exception, or subtype it and use the expects annotation.

Done. content checked .


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java, line
220
> > <https://reviews.apache.org/r/11925/diff/3/?file=320916#file320916line220>
> >
> >     nit: can this be made more readable?

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java, line
236
> > <https://reviews.apache.org/r/11925/diff/3/?file=320916#file320916line236>
> >
> >     Please separate out all the test cases into individual tests.

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 189
> > <https://reviews.apache.org/r/11925/diff/3/?file=320915#file320915line189>
> >
> >     Hive TypeInfo allows for non-string keys, while Avro does not.  There should
be a check here for that.

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java, line 116
> > <https://reviews.apache.org/r/11925/diff/3/?file=320914#file320914line116>
> >
> >     Rather than building the avro schema string by hand and then generating the
schema object, why not generate the avro schema and then run toString on it?  This guarantees
the schema will be correctly formed and type checked.

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java, line 105
> > <https://reviews.apache.org/r/11925/diff/3/?file=320914#file320914line105>
> >
> >     We lose any column comments that may have been on the original schema.  Does
that normally happen with CTAS in Hive?

Normal CTAS also does not copy the comments.


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java, line
229
> > <https://reviews.apache.org/r/11925/diff/3/?file=320916#file320916line229>
> >
> >     Rather than hand-coding these, can we have Hive generate them?  This will catch
any changes if Hive changes how it generates them later on.

Done


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > ql/src/test/queries/clientpositive/avro_without_schema.q, line 26
> > <https://reviews.apache.org/r/11925/diff/3/?file=320909#file320909line26>
> >
> >     I don't see a test that exercises the generated schema through whole mapreduce
query, just ones that exercise the metadata store...

This test has the "group by".


> On July 12, 2013, 10:32 p.m., Jakob Homan wrote:
> > ql/src/test/queries/clientpositive/avro_create_as_select.q, line 3
> > <https://reviews.apache.org/r/11925/diff/3/?file=320906#file320906line3>
> >
> >     This test doesn't actually work on a non-avro table.  It just works on the definition
of a non-avro table.  We should have another one that does a select from a populated, non-avro-backed
table and verify the values are converted corrected.

There is a hive problem of creating internal column names such as  col0, col1 that creates
CTAS command to fail. This one needs to be resolved in a separate JIRA.


- Mohammad


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11925/#review23102
-----------------------------------------------------------


On July 12, 2013, 6:49 p.m., Mohammad Islam wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11925/
> -----------------------------------------------------------
> 
> (Updated July 12, 2013, 6:49 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Jakob Homan.
> 
> 
> Bugs: HIVE-3159
>     https://issues.apache.org/jira/browse/HIVE-3159
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Problem:
> Hive doesn't support to create a Avro-based table using HQL create table command. It
currently requires to specify Avro schema literal or schema file name.
> For multiple cases, it is very inconvenient for user.
> Some of the un-supported use cases:
> 1. Create table ... <Avro-SERDE etc.> as SELECT ... from <NON-AVRO FILE>
> 2. Create table ... <Avro-SERDE etc.> as SELECT from <AVRO TABLE>
> 3. Create  table  without specifying Avro schema.
> 
> 
> Diffs
> -----
> 
>   ql/src/test/queries/clientpositive/avro_create_as_select.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/avro_create_as_select2.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/avro_no_schema_test.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/avro_without_schema.q PRE-CREATION 
>   ql/src/test/results/clientpositive/avro_create_as_select.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/avro_create_as_select2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/avro_no_schema_test.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/avro_without_schema.q.out PRE-CREATION 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 13848b6 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java PRE-CREATION

>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 010f614 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java PRE-CREATION

> 
> Diff: https://reviews.apache.org/r/11925/diff/
> 
> 
> Testing
> -------
> 
> Wrote a new java Test class for a new Java class. Added a new test case into existing
java test class. In addition, there are 4 .q file for testing multiple use-cases.
> 
> 
> Thanks,
> 
> Mohammad Islam
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message