spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-15425) Disallow cartesian joins by default
Date Fri, 20 May 2016 02:11:13 GMT

     [ https://issues.apache.org/jira/browse/SPARK-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-15425:
------------------------------------

    Assignee:     (was: Apache Spark)

> Disallow cartesian joins by default
> -----------------------------------
>
>                 Key: SPARK-15425
>                 URL: https://issues.apache.org/jira/browse/SPARK-15425
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>
> It is fairly easy for users to shoot themselves in the foot by running cartesian joins.
Often they might not even be aware of which join method was chosen. This happened to me a few times
in the last few weeks.
> It would be a good idea to disable cartesian joins by default and require enabling them
explicitly via a "crossJoin" method or, in SQL, "cross join". However, this might be too large a
scope for 2.0 given the timing. As a small and quick fix, we can just have a single config
option (spark.sql.join.enableCartesian) that controls this behavior. In the future we can
implement the fine-grained control.
> Note that the error message should be friendly and say "Set spark.sql.join.enableCartesian
to true to turn on cartesian joins."
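
The gating behavior proposed above can be illustrated with a small, self-contained sketch. This is not Spark code and does not use any Spark API: the functions `check_join` and `join`, the `CartesianJoinError` class, and the dict-based configuration are all hypothetical names introduced only for illustration. The only details taken from the issue are the config key (spark.sql.join.enableCartesian), the assumption that it is off by default, and the wording of the error message.

```python
# Hypothetical sketch of the proposed behavior; none of these names are Spark APIs.

# Config key proposed in the issue description.
ENABLE_CARTESIAN_KEY = "spark.sql.join.enableCartesian"


class CartesianJoinError(Exception):
    """Raised when a join without a condition is attempted while the flag is off."""


def check_join(conf, join_condition):
    """Reject a join that has no condition unless the config flag enables it.

    conf: dict mapping config keys to string values; a missing key is
          treated as "false" (cartesian joins disallowed by default).
    join_condition: a predicate, or None for a join with no predicate,
          which is a cartesian product.
    """
    enabled = conf.get(ENABLE_CARTESIAN_KEY, "false").lower() == "true"
    if join_condition is None and not enabled:
        # Error text as requested in the issue description.
        raise CartesianJoinError(
            "Set spark.sql.join.enableCartesian to true to turn on cartesian joins."
        )


def join(conf, left, right, join_condition=None):
    """Naive nested-loop join over two lists, guarded by check_join."""
    check_join(conf, join_condition)
    if join_condition is None:
        return [(l, r) for l in left for r in right]
    return [(l, r) for l in left for r in right if join_condition(l, r)]
```

With an empty configuration, `join({}, [1], [2])` raises `CartesianJoinError` with the friendly message, while a join that supplies a condition, or one run with the flag set to "true", proceeds as usual.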



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

