impala-reviews mailing list archives

From "Alex Behm (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5309: [DOCS] Add TABLESAMPLE clause to SELECT statement
Date Wed, 23 Aug 2017 03:55:13 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-5309: [DOCS] Add TABLESAMPLE clause to SELECT statement

Patch Set 1:


Looks good, just minor comments
File docs/topics/impala_scalability.xml:

Line 863:   queries to understand the data distribution and plan a partitioning strategy,
I'd leave out the "to understand the data distribution and plan a partitioning strategy" because
that already supposes a certain use case in the user's mind. I'd not make any assumptions
about what the user wants to do with TABLESAMPLE.

Line 865:   to only a percentage of data within the table. This technique reduces the overhead
File docs/topics/impala_select.xml:

Line 175:         clause immediately after a table reference, to specify that the query only processes an
Maybe "a certain percentage of the table data"? An "arbitrary portion" sounds strange, and it's not
really completely arbitrary.
File docs/topics/impala_tablesample.xml:

Line 57:       The <codeph>TABLESAMPLE</codeph> clause comes immediately after a table name.
Suggest "table name or alias", e.g.

from mytable t tablesample ...

Line 69:       processing a particular set of data files, the proportion of sampled data from
Suggest "selecting a random set of data files" instead of "processing a particular set of
data files".

Line 77:       sampling considers the same set of data files each time. <codeph>REPEATABLE</codeph>
Suggest "selects" instead of "considers".
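
To illustrate the distinction behind the two wording suggestions above (sampling selects a random set of data files, and a repeatable seed makes that selection the same each time), here is a minimal Python sketch. It is not Impala's implementation: the function name sample_files, the file names and sizes, and the stopping rule are all assumptions made up for illustration.

```python
import random


def sample_files(file_sizes, percent, seed=None):
    """Select a random set of files covering roughly `percent` of total bytes.

    Hypothetical illustration only; not Impala's actual sampling code.
    file_sizes maps a file name to its size in bytes.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")

    # A fixed seed makes the shuffle, and therefore the selection, repeatable.
    rng = random.Random(seed)

    # Start from a stable order so that only the seed influences the result.
    names = sorted(file_sizes)
    rng.shuffle(names)

    target = sum(file_sizes.values()) * percent / 100.0
    selected, total = [], 0
    for name in names:
        if total >= target:
            break
        selected.append(name)
        total += file_sizes[name]
    return selected


if __name__ == "__main__":
    files = {"a.parq": 100, "b.parq": 300, "c.parq": 200, "d.parq": 400}
    # Same seed, same set of files on every call.
    print(sample_files(files, 50, seed=42) == sample_files(files, 50, seed=42))
```

Because the sketch picks whole files, the sampled bytes can overshoot the requested percentage, and without a seed the selected set may differ from call to call.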

Line 172:       by itself, because all phases of query execution use less data overall.
This is not necessarily true, depending on whether the small query optimization kicks in with

Line 257:       table metadata is not updated by a <codeph>REFRESH</codeph> 


Gerrit-MessageType: comment
Gerrit-Change-Id: Idd7e5b7cfe11c986348bc6c8d1b11921f34df336
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Greg Rahn <>
Gerrit-Reviewer: John Russell <>
Gerrit-Reviewer: Mostafa Mokhtar <>
Gerrit-HasComments: Yes
