impala-reviews mailing list archives

From "Jakub Kukul (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Date Thu, 18 May 2017 10:21:50 GMT
Jakub Kukul has posted comments on this change.

Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6550/3/docs/topics/impala_parquet.xml
File docs/topics/impala_parquet.xml:

PS3, Line 1154: ENUM
> Are enums logical types?
Yes, enums are logical types. The documentation for them has been missing from parquet-format,
so I recently opened a PR to add it:
https://github.com/apache/parquet-format/pull/54


http://gerrit.cloudera.org:8080/#/c/6550/3/fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java:

Line 274:       // UTF8 is the type annotation Parquet uses for strings
> This comment needs to be updated. It would be good if you put a link to a s
Done


http://gerrit.cloudera.org:8080/#/c/6550/3/testdata/data/schemas/logicaltypes.parquet
File testdata/data/schemas/logicaltypes.parquet:

> How did you generate this file? Was it with Hive?
I generated this file from a protobuf file, using ProtoParquetWriter:
https://github.com/Parquet/parquet-mr/blob/master/parquet-protobuf/src/main/java/parquet/proto/ProtoParquetWriter.java


http://gerrit.cloudera.org:8080/#/c/6550/2/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
File testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test:

Line 57: create table $DATABASE.like_logicaltypes_file like parquet
> Please indicate here what the logic types in question are.
Done


Line 66: ---- TYPES
> You'll also want to see that SELECT works, I think.
This file only contains queries that test table creation, so a SELECT test probably doesn't
belong here.

Also, I am not sure such a test is within the scope of this ticket. We just want to make
sure that Parquet columns annotated with the ENUM logical type, e.g.:
```
optional binary string_col (ENUM);
```
will end up as STRING columns in the Impala table definition, just as un-annotated
Parquet binary columns do, e.g.:
```
optional binary string_col;
```

Once the Impala table is created, these columns are regular STRING columns, and I believe there
are already several tests for querying STRING columns.
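
To make the intended mapping concrete, here is a minimal sketch. It is not the actual code in
CreateTableLikeFileStmt.java; the class, enum, and method names are hypothetical, and it only
models the case discussed above: a Parquet binary column that is un-annotated, UTF8-annotated,
or ENUM-annotated becomes a STRING column.

```java
import java.util.Optional;

public class EnumAsStringSketch {
    // Hypothetical stand-in for the annotation a Parquet BINARY column may carry.
    // OTHER covers every annotation this sketch does not model.
    enum Annotation { NONE, UTF8, ENUM, OTHER }

    // Returns the Impala column type name for a BINARY column with the given
    // annotation, or empty when the annotation is outside the scope of this sketch.
    static Optional<String> impalaTypeForBinary(Annotation annotation) {
        switch (annotation) {
            case NONE:  // optional binary string_col;
            case UTF8:  // optional binary string_col (UTF8);
            case ENUM:  // optional binary string_col (ENUM);
                return Optional.of("STRING");
            default:
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        for (Annotation a : Annotation.values()) {
            String type = impalaTypeForBinary(a).orElse("(not covered by this sketch)");
            System.out.println(a + " -> " + type);
        }
    }
}
```

The point is only that ENUM takes the same path as the un-annotated and UTF8 cases, so nothing
downstream of table creation sees a new column type.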


-- 
To view, visit http://gerrit.cloudera.org:8080/6550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jakub Kukul <jakub.kukul@gmail.com>
Gerrit-Reviewer: Jakub Kukul <jakub.kukul@gmail.com>
Gerrit-Reviewer: Jim Apple <jbapple-impala@apache.org>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-HasComments: Yes
