drill-dev mailing list archives

From Aman Sinha <amansi...@apache.org>
Subject Re: Drill query planning error
Date Wed, 26 Jul 2017 16:20:15 GMT
[Since this is Drill specific, I put dev@calcite on BCC].

If a query has two aggregates, COUNT(DISTINCT a) and COUNT(DISTINCT b), the
Calcite logical plan consists of a cartesian join of two subqueries, each of
which first does a group-by on the distinct column and then a count
aggregate. By default, Drill only processes a cartesian join if one input
of the join is known to be scalar (single row). It sounds like after you
did the transformation to use the cache, that scalar property somehow did
not get propagated.

You can override this behavior with a session option. It makes the planner
use a nested loop join even if the inputs are not provably scalar, so it
should be used only for the specific query. For a more general solution, I
believe you may have to create an enhancement JIRA with appropriate details.

   alter session set `planner.enable_nljoin_for_scalar_only` = false;
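
To make the plan shape above concrete, here is a rough hand-written sketch.
The table dfs.tmp.`events` and the columns a and b are hypothetical, and the
second query only approximates what the planner produces; it is not the
actual Calcite output. The last part shows one way to limit the override to
a single query, assuming the option is at its default of true beforehand.

```sql
-- Hypothetical table and columns, for illustration only.
-- A query with two distinct counts:
SELECT COUNT(DISTINCT a) AS cnt_a, COUNT(DISTINCT b) AS cnt_b
FROM dfs.tmp.`events`;

-- Roughly the same thing written as a cartesian join of two subqueries.
-- Each subquery groups by the distinct column, then counts the groups,
-- so each side of the join yields exactly one row (a scalar input).
SELECT ca.cnt_a, cb.cnt_b
FROM (SELECT COUNT(a) AS cnt_a
      FROM (SELECT a FROM dfs.tmp.`events` GROUP BY a) ga) ca,
     (SELECT COUNT(b) AS cnt_b
      FROM (SELECT b FROM dfs.tmp.`events` GROUP BY b) gb) cb;

-- Allow a nested loop join even when neither input is provably scalar.
ALTER SESSION SET `planner.enable_nljoin_for_scalar_only` = false;

-- ... run the one query that fails to plan here ...

-- Restore the stricter behavior (assumed default) for the rest of the session.
ALTER SESSION SET `planner.enable_nljoin_for_scalar_only` = true;
```

If a rewrite rule replaces one of the subqueries with a cache scan, the
planner may no longer be able to prove that side returns a single row, which
would match the behavior described above.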

On Wed, Jul 26, 2017 at 4:14 AM, weijie tong <tongweijie178@gmail.com> wrote:

> Hi all:
>   I materialize the count distinct query result to a cache; then, when a
> user runs a count distinct query, a specific rule translates the query to
> use the cache. It works when the query has only one count(distinct)
> operator, but when it has two count(distinct) operators, it causes an
> error. The error info is here:
> https://gist.github.com/weijietong/1b8ed12db9490bf006e8b3fe0ee52269
> Best Regards.
