impala-reviews mailing list archives

From "Lars Volker (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5359: [DOCS] Document SORT BY syntax for CREATE TABLE and ALTER TABLE
Date Thu, 25 May 2017 18:24:57 GMT
Lars Volker has posted comments on this change.

Change subject: IMPALA-5359: [DOCS] Document SORT BY syntax for CREATE TABLE and ALTER TABLE
......................................................................


Patch Set 1:

(5 comments)

Thank you for documenting this. Please see my inline comments. Let me know if you'd like to
discuss them in person.

http://gerrit.cloudera.org:8080/#/c/6981/1/docs/topics/impala_create_table.xml
File docs/topics/impala_create_table.xml:

Line 388:       <codeph>CREATE TABLE AS SELECT</codeph> operation. Creating data files that are
I think it's important to make clear that the SORT BY property of the source table does not
affect the target table. If SOURCE has SORT BY() columns, CREATE TABLE TARGET AS SELECT * FROM
SOURCE; will not copy them over to TARGET.


Line 389:       sorted is most useful for Parquet tables, where the metadata includes the minimum and
Here it should be clear that the information is stored in the file metadata inside each Parquet
file, and not in our own metadata store.


Line 390:       maximum values for each column in each data file. Grouping data values together
Technically, statistics are stored per row group. Impala only writes one row group per file, but
that's a self-imposed limitation.
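
To illustrate why per-row-group min/max statistics benefit from sorted data, here is a minimal
sketch in Python. It is a toy model and not Impala or Parquet code: RowGroupStats, build_stats,
and groups_to_scan are hypothetical names, and a "row group" here is just a fixed-size slice of
a list.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RowGroupStats:
    """Min/max of one column within one row group (hypothetical structure)."""
    min_value: int
    max_value: int


def build_stats(values: Sequence[int], rows_per_group: int) -> List[RowGroupStats]:
    """Split values into consecutive row groups and record min/max for each."""
    stats = []
    for start in range(0, len(values), rows_per_group):
        group = values[start:start + rows_per_group]
        stats.append(RowGroupStats(min(group), max(group)))
    return stats


def groups_to_scan(stats: List[RowGroupStats], target: int) -> int:
    """Count row groups whose [min, max] range could contain target."""
    return sum(1 for s in stats if s.min_value <= target <= s.max_value)


unsorted_values = [7, 1, 9, 3, 8, 2, 6, 4]
sorted_values = sorted(unsorted_values)

# Equality predicate "col = 6" with two rows per row group.
unsorted_scans = groups_to_scan(build_stats(unsorted_values, 2), 6)
sorted_scans = groups_to_scan(build_stats(sorted_values, 2), 6)
print(unsorted_scans, sorted_scans)
```

In this toy data the unsorted layout gives every row group a wide min/max range, so none can be
skipped, while the sorted layout narrows the ranges enough that only one row group needs to be
read.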


Line 400:       evident with Parquet tables.
We could mention here that other file formats don't have statistics inside the file metadata.


Line 412:       tools that creat HDFS files, Impala does not guarantee or rely on the data being
Typo: "creat" should be "create".


-- 
To view, visit http://gerrit.cloudera.org:8080/6981
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icd571cd8840368edb327d16d27192458838ef234
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell <jrussell@cloudera.com>
Gerrit-Reviewer: Alan Choi <alan@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-HasComments: Yes
