impala-reviews mailing list archives

From "Jakub Kukul (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Date Fri, 02 Jun 2017 10:36:26 GMT
Jakub Kukul has posted comments on this change.

Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6550/4/fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java:

PS4, Line 276:       // the original data type, before conversion to parquet, had been enum.
> nit: Long line. Lines should wrap before 90 characters are hit.
Done


PS4, Line 277: Ap
> nit: Please remove the extra leading spaces before 'types' (and in the line
Done


http://gerrit.cloudera.org:8080/#/c/6550/4/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:

Line 144:   hadoop fs -put $SCHEMA_SRC_DIR/enum.parquet ${SCHEMA_DEST_DIR}/
> Please rename logicaltypes.parquet to enum.parquet. The current file name c
Fixed. The rename makes sense, since there is already another file, `decimal.parquet`,
that contains decimal logical types.


http://gerrit.cloudera.org:8080/#/c/6550/4/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
File testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test:

Line 68: ====
> Add tests that runs a query against such a table, either in this file or so
I considered this as well, but decided against it because:
- This file contains only `CREATE` statements, so adding a single `SELECT` statement would be
inconsistent. I didn't find another good place for such a test, and I'm not sure it's worth
adding a dedicated test file for it.
- Once a table is created, querying a field that was originally annotated as ENUM is
no different from querying a regular STRING field in a parquet file, and I assume there is
already test coverage for that.
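
To illustrate the second point, here is a minimal, self-contained sketch. It is not the
code from this patch: `EnumMappingSketch`, `Annotation` and `toImpalaType` are hypothetical
names used only for illustration, and the sketch assumes a BINARY field carrying either a
UTF8 or an ENUM annotation. It shows that both annotations end up as the same column type,
so a query cannot tell them apart after the table is created.

```java
public class EnumMappingSketch {
    // Hypothetical stand-in for the logical annotation on a Parquet BINARY field.
    enum Annotation { UTF8, ENUM }

    // Hypothetical helper: picks the Impala column type for an annotated BINARY field.
    static String toImpalaType(Annotation annotation) {
        switch (annotation) {
            case UTF8:
            case ENUM:
                // With the change discussed here, both annotations map to STRING.
                return "STRING";
            default:
                throw new IllegalArgumentException("Unsupported annotation: " + annotation);
        }
    }

    public static void main(String[] args) {
        String fromEnum = toImpalaType(Annotation.ENUM);
        String fromUtf8 = toImpalaType(Annotation.UTF8);
        // The resulting column types are identical, so queries see a plain STRING.
        System.out.println(fromEnum.equals(fromUtf8));
    }
}
```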


-- 
To view, visit http://gerrit.cloudera.org:8080/6550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jakub Kukul <jakub.kukul@gmail.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Attila Jeges <attilaj@cloudera.com>
Gerrit-Reviewer: Jakub Kukul <jakub.kukul@gmail.com>
Gerrit-Reviewer: Jim Apple <jbapple-impala@apache.org>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-HasComments: Yes
