impala-reviews mailing list archives

From "Mostafa Mokhtar (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5583: [DOCS] Document default_join_distribution_mode query option
Date Thu, 06 Jul 2017 19:01:06 GMT
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7300/1/docs/topics/impala_default_join_distribution_mode.xml
File docs/topics/impala_default_join_distribution_mode.xml:

Line 48:       Impala uses the <q>broadcast</q> technique that transmits the entire contents
> What is the answer if both tables are missing stats? Does Impala make a ded
If both tables are missing stats, the table listed first in the query will be the probe side,
while the second table will be broadcast.
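
To make the rule concrete, here is a minimal sketch of the behavior described above for a
two-table join where neither table has stats. The function name and the table names are
hypothetical and exist only for illustration; this models the statement in this comment, not
the actual planner code.

```python
def assign_join_sides_without_stats(tables_in_query_order):
    """Model the rule stated above for a two-table join with no stats.

    Hypothetical helper for illustration only; it is not part of Impala.
    """
    if len(tables_in_query_order) != 2:
        raise ValueError("this sketch only covers a two-table join")
    first, second = tables_in_query_order
    # The first-listed table stays in place as the probe side; the second
    # table is the one sent in full to every executor node.
    return {"probe": first, "broadcast": second}


# Table names below are made up for the example.
sides = assign_join_sides_without_stats(["big_tbl", "small_tbl"])
```

In this sketch, swapping the order of the two tables in the query swaps which one is broadcast,
which is why the doc text should spell the rule out.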


Line 61:       from each table to each executor node.
> I'd prefer to prepare and fine-tune a brief explanation so I could reuse th
This is the description for the SHUFFLE join; we should use similar wording:

[SHUFFLE] - Makes that join operation use the "partitioned" technique, which divides up corresponding
rows from both tables using a hashing algorithm, sending subsets of the rows to other nodes
for processing. (The keyword SHUFFLE is used to indicate a "partitioned join", because that
type of join is not related to "partitioned tables".) Since the alternative "broadcast" join
mechanism is the default when table and index statistics are unavailable, you might use this
hint for queries where broadcast joins are unsuitable; typically, partitioned joins are more
efficient for joins between large tables of similar size.
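
As a rough illustration of the "partitioned" technique in that description, the sketch below
hashes the join key of each row from both tables to pick a node, so matching rows end up on the
same node and each node joins only its own subset. All names here (partition_rows,
partitioned_join, num_nodes) are assumptions made for the example; this is a single-process toy
model, not how Impala implements its exchange or join operators.

```python
from collections import defaultdict


def partition_rows(rows, key, num_nodes):
    """Assign each row to a node by hashing its join key."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[hash(row[key]) % num_nodes].append(row)
    return buckets


def partitioned_join(left, right, key, num_nodes):
    """Join two row lists by hash-partitioning both sides on the key."""
    left_parts = partition_rows(left, key, num_nodes)
    right_parts = partition_rows(right, key, num_nodes)
    joined = []
    for node in range(num_nodes):
        # Each simulated node builds a lookup over its share of the right
        # table, then probes it with its share of the left table.
        build = defaultdict(list)
        for r in right_parts.get(node, []):
            build[r[key]].append(r)
        for l in left_parts.get(node, []):
            for r in build.get(l[key], []):
                joined.append({**l, **r})
    return joined


# Made-up sample rows for the example.
left_rows = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 3, "a": "z"}]
right_rows = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}, {"id": 4, "b": "s"}]
result = partitioned_join(left_rows, right_rows, "id", num_nodes=3)
```

The contrast with broadcast is that no node ever receives a whole table: each row from either
side is sent to exactly one node.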


http://gerrit.cloudera.org:8080/#/c/7300/2/docs/topics/impala_default_join_distribution_mode.xml
File docs/topics/impala_default_join_distribution_mode.xml:

Line 40:       This option determines the join strategy that Impala uses when any of the tables
Alex's comment about not using "join strategy" hasn't been addressed.

Can you please use "join distribution" instead?


-- 
To view, visit http://gerrit.cloudera.org:8080/7300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell <jrussell@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: John Russell <jrussell@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-HasComments: Yes
