hawq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ruilong Huo (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-884) Subquery scan return no tuple in query with CTE
Date Thu, 30 Jun 2016 08:23:10 GMT
Ruilong Huo created HAWQ-884:
--------------------------------

             Summary: Subquery scan return no tuple in query with CTE
                 Key: HAWQ-884
                 URL: https://issues.apache.org/jira/browse/HAWQ-884
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Query Execution
            Reporter: Ruilong Huo
            Assignee: Lei Chang


Here is the CTE query that return no tuple, while it should return 1 tuple actually.
{noformat}
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),                            
                                                                                         
                                               t2 AS ( SELECT 3 c3 )                     
                                                                                         
                                                                                         
   SELECT * FROM t1                                                                      
                                                                                         
                                                      WHERE EXISTS (SELECT * FROM t2 WHERE
c3=c1);                                                                                  
                                                                                         
               

-- Actual
c1 | c2
----+----
(0 rows)

-- Expected
 c1 | c2
----+----
  3 |  4
(1 row)
{noformat}

The root cause is that during query planning, it is correct that t2 in the CTE (common table
expression) clause is treated as subquery scan and then materialized. However, during query
execution, it generate no tuple when t2 is evaluated. Thus, the join of t1 with t2 generate
no tuple. We can see this in query execution statistics using explain analyze while running
the query with optimizer = off.
{noformat}
   ->  Materialize  (cost=0.00..0.01 rows=1 width=0)
         Rows out:  0 rows with 0.167 ms to end of 3 scans, start offset by 0.265 ms.
         ->  Limit  (cost=0.00..0.00 rows=1 width=0)
               Rows out:  0 rows with 0.003 ms to end, start offset by 0.257 ms.
               ->  Subquery Scan t2  (cost=0.00..0.01 rows=1 width=0)
                     Rows out:  0 rows with 0.002 ms to end, start offset by 0.258 ms.
                     ->  Result  (cost=0.00..0.01 rows=1 width=0)
                           One-Time Filter: 3 = $0
                           Rows out:  0 rows with 0.001 ms to end, start offset by 0.258 ms.
{noformat}

Here is details:
1) hawq 2.0 with optimizer off (planner): subquery scan generate no tuple
{noformat}
show optimizer;
 optimizer
-----------
 off
(1 row)

WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
     t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
 c1 | c2
----+----
(0 rows)

EXPLAIN ANALYZE
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
     t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
                                                                            QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.05..0.29 rows=72 width=8)
   Rows out:  Avg 0.0 rows x 0 workers.  Max/Last(/) 0/0 rows with 0.237/0.237 ms to end.
   ->  Limit  (cost=0.00..0.00 rows=1 width=0)
         Rows out:  Avg 0.0 rows x 0 workers.  Max/Last(/) 0/0 rows with 0.003/0.003 ms to
end.
         ->  Subquery Scan t2  (cost=0.00..0.01 rows=6 width=0)
               Rows out:  Avg 0.0 rows x 0 workers.  Max/Last(/) 0/0 rows with 0.002/0.002
ms to end.
               ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     One-Time Filter: 3 = $0
                     Rows out:  Avg 0.0 rows x 0 workers.  Max/Last(/) 0/0 rows with 0/0 ms
to end.
   ->  Materialize  (cost=0.05..0.17 rows=12 width=8)
         Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(/) 1/1 rows with 0.129/0.129 ms to
end, start offset by 0.135/0.135 ms.
         ->  Append  (cost=0.00..0.04 rows=2 width=0)
               Rows out:  Avg 2.0 rows x 1 workers.  Max/Last(/) 2/2 rows with 0.002/0.002
ms to first row, 0.004/0.004 ms to end, start offset by 0.255/0.255 ms.
               ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(/) 1/1 rows with 0.002/0.002
ms to end, start offset by 0.255/0.255 ms.
               ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(/) 1/1 rows with 0/0 ms
to end, start offset by 0.261/0.261 ms.
 Slice statistics:
   (slice0)    Executor memory: 61K bytes.
 Statement statistics:
   Memory used: 128000K bytes
 Settings:  default_hash_table_bucket_number=6; optimizer=off
 Optimizer status: legacy query optimizer
 Data locality statistics:
   no data locality information in this query
 Total runtime: 0.372 ms
(26 rows)
{noformat}
2) hawq 2.0 with optimizer on (orca): fallback to planner and thus subquery scan generate
no tuple
{noformat}
show optimizer;
 optimizer
-----------
 on
(1 row)

WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
     t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
 c1 | c2
----+----
(0 rows)

EXPLAIN ANALYZE
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
     t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
                                                                            QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.05..0.29 rows=72 width=8)
   Rows out:  Avg 0.0 rows x 0 workers.  Max/Last(/) 0/0 rows with 0.273/0.273 ms to end.
   ->  Limit  (cost=0.00..0.00 rows=1 width=0)
         Rows out:  Avg 0.0 rows x 0 workers.  Max/Last(/) 0/0 rows with 0.003/0.003 ms to
end.
         ->  Subquery Scan t2  (cost=0.00..0.01 rows=6 width=0)
               Rows out:  Avg 0.0 rows x 0 workers.  Max/Last(/) 0/0 rows with 0.002/0.002
ms to end.
               ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     One-Time Filter: 3 = $0
                     Rows out:  Avg 0.0 rows x 0 workers.  Max/Last(/) 0/0 rows with 0.001/0.001
ms to end.
   ->  Materialize  (cost=0.05..0.17 rows=12 width=8)
         Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(/) 1/1 rows with 0.151/0.151 ms to
end, start offset by 0.189/0.189 ms.
         ->  Append  (cost=0.00..0.04 rows=2 width=0)
               Rows out:  Avg 2.0 rows x 1 workers.  Max/Last(/) 2/2 rows with 0.003/0.003
ms to first row, 0.004/0.004 ms to end, start offset by 0.327/0.327 ms.
               ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(/) 1/1 rows with 0.002/0.002
ms to end, start offset by 0.327/0.327 ms.
               ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(/) 1/1 rows with 0/0 ms
to end, start offset by 0.337/0.337 ms.
 Slice statistics:
   (slice0)    Executor memory: 61K bytes.
 Statement statistics:
   Memory used: 128000K bytes
 Settings:  default_hash_table_bucket_number=6
 Optimizer status: legacy query optimizer
 Data locality statistics:
   no data locality information in this query
 Total runtime: 0.468 ms
(26 rows)
{noformat}
3) hawq 1.x with optimizer off (planner): subquery scan generate no tuple
{noformat}
show optimizer;
 optimizer
-----------
 off
(1 row)

WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
     t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
 c1 | c2
----+----
(0 rows)

EXPLAIN ANALYZE
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
     t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..0.08 rows=4 width=8)
   Rows out:  0 rows with 0.220 ms to end, start offset by 0.093 ms.
   ->  Append  (cost=0.00..0.04 rows=2 width=0)
         Rows out:  2 rows with 0.001 ms to first row, 0.003 ms to end, start offset by 0.262
ms.
         ->  Result  (cost=0.00..0.01 rows=1 width=0)
               Rows out:  1 rows with 0.001 ms to end, start offset by 0.262 ms.
         ->  Result  (cost=0.00..0.01 rows=1 width=0)
               Rows out:  1 rows with 0.001 ms to end, start offset by 0.264 ms.
   ->  Materialize  (cost=0.00..0.01 rows=1 width=0)
         Rows out:  0 rows with 0.167 ms to end of 3 scans, start offset by 0.265 ms.
         ->  Limit  (cost=0.00..0.00 rows=1 width=0)
               Rows out:  0 rows with 0.003 ms to end, start offset by 0.257 ms.
               ->  Subquery Scan t2  (cost=0.00..0.01 rows=1 width=0)
                     Rows out:  0 rows with 0.002 ms to end, start offset by 0.258 ms.
                     ->  Result  (cost=0.00..0.01 rows=1 width=0)
                           One-Time Filter: 3 = $0
                           Rows out:  0 rows with 0.001 ms to end, start offset by 0.258 ms.
 Slice statistics:
   (slice0)    Executor memory: 61K bytes.
 Statement statistics:
   Memory used: 128000K bytes
 Settings:  optimizer=off
 Optimizer status: legacy query optimizer
 Total runtime: 0.315 ms
(24 rows)
{noformat}
4) hawq 1.x with optimizer on (orca): work as expected
{noformat}
show optimizer;
 optimizer
-----------
 on
(1 row)

WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
     t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
 c1 | c2
----+----
  3 |  4
(1 row)

EXPLAIN ANALYZE
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
     t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Hash EXISTS Join  (cost=0.00..0.00 rows=2 width=8)
   Hash Cond: "outer".c1 = "inner".c3
   Rows out:  1 rows with 0.835 ms to first row, 2.373 ms to end, start offset by 0.194 ms.
   Executor memory:  1K bytes.
   Work_mem used:  1K bytes. Workfile: (0 spilling, 0 reused)
   Hash chain length 1.0 avg, 1 max, using 1 of 524341 buckets.
   ->  Append  (cost=0.00..0.00 rows=2 width=8)
         Rows out:  2 rows with 0.002 ms to first row, 0.004 ms to end, start offset by 1.023
ms.
         ->  Result  (cost=0.00..0.00 rows=1 width=8)
               Rows out:  1 rows with 0.001 ms to first row, 0.002 ms to end, start offset
by 1.023 ms.
               ->  Result  (cost=0.00..0.00 rows=1 width=1)
                     Rows out:  1 rows with 0 ms to end, start offset by 1.024 ms.
         ->  Result  (cost=0.00..0.00 rows=1 width=8)
               Rows out:  1 rows with 0.001 ms to end, start offset by 1.026 ms.
               ->  Result  (cost=0.00..0.00 rows=1 width=1)
                     Rows out:  1 rows with 0 ms to end, start offset by 1.026 ms.
   ->  Hash  (cost=0.00..0.00 rows=1 width=4)
         Rows in:  1 rows with 0.010 ms to end, start offset by 1.013 ms.
         ->  Result  (cost=0.00..0.00 rows=1 width=4)
               Rows out:  1 rows with 0.006 ms to first row, 0.007 ms to end, start offset
by 1.013 ms.
               ->  Result  (cost=0.00..0.00 rows=1 width=1)
                     Rows out:  1 rows with 0.001 ms to end, start offset by 1.014 ms.
 Slice statistics:
   (slice0)    Executor memory: 8270K bytes.  Work_mem: 1K bytes max.
 Statement statistics:
   Memory used: 128000K bytes
 Optimizer status: PQO version 1.591
 Total runtime: 2.572 ms
(28 rows)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message