spark-issues mailing list archives

From "Wenchen Fan (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-16475) Broadcast Hint for SQL Queries
Date Tue, 14 Feb 2017 22:12:41 GMT

     [ https://issues.apache.org/jira/browse/SPARK-16475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reassigned SPARK-16475:
-----------------------------------

    Assignee: Reynold Xin

> Broadcast Hint for SQL Queries
> ------------------------------
>
>                 Key: SPARK-16475
>                 URL: https://issues.apache.org/jira/browse/SPARK-16475
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Reynold Xin
>            Assignee: Reynold Xin
>              Labels: releasenotes
>             Fix For: 2.2.0
>
>         Attachments: BroadcastHintinSparkSQL.pdf
>
>
> A broadcast hint lets users manually annotate a query to suggest a join method to the query optimizer. It is very useful when the optimizer cannot make an optimal decision about join methods, either because it is conservative or because it lacks proper statistics.
> The DataFrame API has had a broadcast hint since Spark 1.5. However, SQL queries have no equivalent functionality. We propose adding a Hive-style broadcast hint to Spark SQL.
> For more information, please see the attached document. One note about the doc: in addition to supporting "MAPJOIN", we should also support "BROADCASTJOIN" and "BROADCAST" in the hint comment, e.g. the following should be accepted:
> {code}
> SELECT /*+ MAPJOIN(b) */ ...
> SELECT /*+ BROADCASTJOIN(b) */ ...
> SELECT /*+ BROADCAST(b) */ ...
> {code}
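
To make the three accepted spellings concrete, here is a minimal sketch in plain Python of recognizing a Hive-style hint comment and treating the three names as aliases. It is not Spark's parser: the regular expression, the `BROADCAST_ALIASES` set, and the `broadcast_hint_tables` helper are hypothetical names introduced only for illustration, and the sketch assumes the hint appears directly after `SELECT` and contains a single hint.

```python
import re

# Hint names the issue says should be accepted. Treating them as
# interchangeable aliases is an assumption of this sketch.
BROADCAST_ALIASES = {"MAPJOIN", "BROADCASTJOIN", "BROADCAST"}

# Matches a Hive-style hint comment directly after SELECT, e.g.
#   SELECT /*+ MAPJOIN(b) */ ...
HINT_PATTERN = re.compile(
    r"^\s*SELECT\s*/\*\+\s*(\w+)\s*\(\s*([^)]*?)\s*\)\s*\*/",
    re.IGNORECASE,
)


def broadcast_hint_tables(sql):
    """Return the table names in a broadcast hint, or [] if there is none."""
    match = HINT_PATTERN.match(sql)
    if match is None:
        return []
    name, args = match.group(1), match.group(2)
    if name.upper() not in BROADCAST_ALIASES:
        # Some other hint; this sketch only handles the broadcast aliases.
        return []
    return [table.strip() for table in args.split(",") if table.strip()]


if __name__ == "__main__":
    for hint in ("MAPJOIN", "BROADCASTJOIN", "BROADCAST"):
        query = f"SELECT /*+ {hint}(b) */ * FROM a JOIN b ON a.id = b.id"
        print(hint, broadcast_hint_tables(query))
```

Under these assumptions all three spellings yield the same list of tables to broadcast, which is the behavior the note above asks for; a real implementation would live in the SQL parser and analyzer rather than in a regular expression.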



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org