superset-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [incubator-superset] villebro edited a comment on issue #8183: How to pass time filters to SQL Lab queries
Date Wed, 18 Sep 2019 21:33:21 GMT
villebro edited a comment on issue #8183: How to pass time filters to SQL Lab queries
URL: https://github.com/apache/incubator-superset/issues/8183#issuecomment-532875347
 
 
   Ok, so I was thinking the following. Say you have a join as follows:
   ```sql
   SELECT fo.dte,
          fc.id,
          fc.name,
          fo.product
   FROM fact_customer fc
   INNER JOIN fact_orders fo ON fc.id = fo.customer_id
   ```
   normally Superset would push this into a subquery and perform filtering/grouping/etc on
that as follows:
   ```sql
   SELECT dte,
          count(*) AS "count(*)"
   FROM
     (SELECT fo.dte,
             fc.id,
             fc.name,
             fo.product
      FROM fact_customer fc
      INNER JOIN fact_orders fo ON fc.id = fo.customer_id) AS expr_qry
   WHERE dte >= '2019-01-01'
     AND dte <= '2019-12-31'
   GROUP BY dte
   ```
   Some engines will not perform predicate pushdown into the subquery, resulting in unnecessarily
expensive queries when materializing the subquery. To get around this I would propose wrapping
the original query in a CTE:
   ```sql
   WITH cte_qry AS
     (SELECT fo.dte,
             fc.id,
             fc.name,
             fo.product
      FROM fact_customer fc
      INNER JOIN fact_orders fo ON fc.id = fo.customer_id)
   SELECT dte,
          count(*) AS "count(*)"
   FROM cte_qry
   GROUP BY dte
   WHERE dte >= '2019-01-01'
     AND dte <= '2019-12-31'
   ```
   Mind you, the majority of modern SQL engines treat subqueries and CTEs equally, i.e. will
not impose a performance penalty on the query, but this may not always be the case. Also,
some engines don't support wrapping CTEs in subqueries, causing any CTEs to fail when wrapped
in a subquery. Especially for CTEs it would be more preferable to wrap the final query of
the CTE in a CTE of it's own, and then let Superset build a query on that. Say, for example,
that you would write the following query in Sql Lab:
   ```sql
   WITH a AS
     (SELECT 1 AS a
      FROM tbl),
        b AS
     (SELECT 1 AS a,
             'a' AS b
      FROM tbl)
   SELECT a.a,
          b.b
   FROM a
   INNER JOIN b ON a.a = b.a
   ````
   This would then become
   ```sql
   WITH a AS
     (SELECT 1 AS a
      FROM tbl),
        b AS
     (SELECT 1 AS a,
             'a' AS b
      FROM tbl),
        cte_qry AS
     (SELECT a.a,
             b.b
      FROM a
      INNER JOIN b ON a.a = b.a)
   SELECT a,
          count(*) AS "count(*)"
   FROM cte_qry
   GROUP BY a
   ```
   when Superset constructs a query, as opposed to the current behaviour of wrapping the CTE
in a subquery.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


Mime
View raw message