spark-issues mailing list archives

From "Srinath (JIRA)" <>
Subject [jira] [Commented] (SPARK-17298) Require explicit CROSS join for cartesian products
Date Mon, 29 Aug 2016 18:33:20 GMT


Srinath commented on SPARK-17298:

You are correct that with this change, queries of the form
    select * from A inner join B
will now throw an error where previously they did not.
The reason for this suggestion is that users often forget to specify join conditions altogether,
leading to incorrect, long-running queries. Requiring explicit cross joins makes the intent clear.

Turning on the spark.sql.crossJoin.enabled flag restores the previous behavior.
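
To make the concern concrete, here is a minimal sketch of how a forgotten join condition silently turns into a cartesian product. It uses Python's built-in sqlite3 module purely as a stand-in engine (an assumption for illustration; it is not Spark and performs no such check), and the table contents are made up.

```python
import sqlite3

# SQLite is only a convenient stand-in here; it is not Spark and has no
# equivalent of the cartesian product check discussed in this issue.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER)")
conn.execute("CREATE TABLE B (id INTEGER)")
conn.executemany("INSERT INTO A VALUES (?)", [(i,) for i in range(3)])
conn.executemany("INSERT INTO B VALUES (?)", [(i,) for i in range(4)])


def count_rows(query):
    """Run the query and return how many rows it produces."""
    return len(conn.execute(query).fetchall())


# Join condition forgotten: every row of A is paired with every row of B.
forgotten = count_rows("SELECT * FROM A INNER JOIN B")

# The same result, but the intent is stated explicitly.
explicit = count_rows("SELECT * FROM A CROSS JOIN B")

# What the user probably meant to write.
intended = count_rows("SELECT * FROM A INNER JOIN B ON A.id = B.id")

print(forgotten, explicit, intended)
```

With 3 and 4 input rows the difference is harmless, but the forgotten-condition result grows as the product of the two table sizes.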

> Require explicit CROSS join for cartesian products
> --------------------------------------------------
>                 Key: SPARK-17298
>                 URL:
>             Project: Spark
>          Issue Type: Story
>          Components: SQL
>            Reporter: Srinath
>            Priority: Minor
> Require the use of CROSS join syntax in SQL (and a new crossJoin DataFrame API) to specify
> explicit cartesian products between relations.
> By cartesian product we mean a join between relations R and S where there is no join
> condition involving columns from both R and S.
> If a cartesian product is detected in the absence of an explicit CROSS join, an error
> must be thrown. Turning on the spark.sql.crossJoin.enabled configuration flag will disable
> this check and allow cartesian products without an explicit cross join.
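
The rule in the description can be sketched in a few lines of plain Python. This is only an illustration of the stated definition, not Spark's implementation: the function names, the CartesianProductError class, and the representation of a join condition as a set of (relation, column) pairs are all hypothetical.

```python
class CartesianProductError(Exception):
    """Hypothetical error type for an implicit cartesian product."""


def is_cartesian_product(left, right, conditions):
    """Return True if no condition involves columns from both relations.

    Each condition is modelled as a set of (relation, column) pairs that
    the predicate reads. This representation is an assumption of the sketch.
    """
    for columns in conditions:
        relations = {relation for relation, _ in columns}
        if left in relations and right in relations:
            return False
    return True


def check_join(left, right, conditions,
               explicit_cross=False, cross_join_enabled=False):
    """Raise unless the join is allowed under the proposed rule.

    cross_join_enabled stands in for the spark.sql.crossJoin.enabled flag.
    """
    if explicit_cross or cross_join_enabled:
        return
    if is_cartesian_product(left, right, conditions):
        raise CartesianProductError(
            f"join between {left} and {right} has no condition on both; "
            "use an explicit CROSS join"
        )


# A condition such as R.id = S.id reads columns from both relations.
check_join("R", "S", [{("R", "id"), ("S", "id")}])

# A filter on R alone (for example R.x > 5) does not count as a join condition.
try:
    check_join("R", "S", [{("R", "x")}])
    rejected = False
except CartesianProductError:
    rejected = True
```

Note that a predicate touching only one side still leaves the join a cartesian product under this definition, which is why the second call is rejected.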
