drill-dev mailing list archives

From Muhammad Gelbana <m.gelb...@gmail.com>
Subject Re: Running cartesian joins on Drill
Date Sat, 06 May 2017 22:05:06 GMT
Here it is:

SELECT * FROM (SELECT 'ABC' `UserID` FROM `dfs`.`path_to_parquet_file` tc
LIMIT 2147483647) `t0` INNER JOIN (SELECT 'ABC' `UserID` FROM
`dfs`.`path_to_parquet_file` tc LIMIT 2147483647) `t1` ON (`t0`.`UserID`
IS NOT DISTINCT FROM `t1`.`UserID`) LIMIT 2147483647

I debugged the Drill code and found that, while checking whether the query
is a cartesian join, it decomposes *IS NOT DISTINCT FROM* into
*`t0`.`UserID` = `t1`.`UserID` OR (`t0`.`UserID` IS NULL && `t1`.`UserID`
IS NULL)*. When that check returns true, it throws an exception saying:
*This query cannot be planned possibly due to either a cartesian join or an
inequality join*
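
For readers unfamiliar with the decomposition, here is a minimal Python sketch of why the two forms agree as join conditions. It models SQL three-valued logic with None standing in for NULL/UNKNOWN. The function names are hypothetical and exist only for this illustration; this is not Drill's planner code.

```python
from itertools import product

# None stands in for SQL NULL (as a value) and UNKNOWN (as a truth value).

def sql_eq(a, b):
    """SQL '=': UNKNOWN (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b

def sql_and(p, q):
    """Three-valued AND."""
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True

def sql_or(p, q):
    """Three-valued OR."""
    if p is True or q is True:
        return True
    if p is None or q is None:
        return None
    return False

def is_not_distinct_from(a, b):
    """NULL-safe equality: always TRUE or FALSE, never UNKNOWN."""
    if a is None or b is None:
        return a is None and b is None
    return a == b

def decomposed(a, b):
    """a = b OR (a IS NULL AND b IS NULL)."""
    return sql_or(sql_eq(a, b), sql_and(a is None, b is None))

# A join keeps a row pair only when the condition is TRUE, so compare
# "decomposed(...) is True" against the NULL-safe form for every pair.
values = [None, "ABC", "XYZ"]
mismatches = [
    (a, b)
    for a, b in product(values, repeat=2)
    if (decomposed(a, b) is True) != is_not_distinct_from(a, b)
]
print(mismatches)
```

The decomposed form can evaluate to UNKNOWN (for example, when exactly one side is NULL), but a join treats UNKNOWN like FALSE, so both forms accept the same row pairs.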


*---------------------*
*Muhammad Gelbana*
http://www.linkedin.com/in/mgelbana

On Sat, May 6, 2017 at 6:53 PM, Gautam Parai <gparai@mapr.com> wrote:

> Can you please specify the query you are trying to execute?
>
>
> Gautam
>
> ________________________________
> From: Muhammad Gelbana <m.gelbana@gmail.com>
> Sent: Saturday, May 6, 2017 7:34:53 AM
> To: user@drill.apache.org; dev@drill.apache.org
> Subject: Running cartesian joins on Drill
>
> Is there a reason why Drill would intentionally reject cartesian join
> queries even if *planner.enable_nljoin_for_scalar_only* is disabled?
>
> Any ideas on how a query could be rewritten to overcome this restriction?
>
> *---------------------*
> *Muhammad Gelbana*
> http://www.linkedin.com/in/mgelbana
>
