hawq-issues mailing list archives

From "Lin Wen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HAWQ-1616) Wrong Result of Hash Join When Enable Bloom filter
Date Thu, 24 May 2018 12:30:00 GMT

     [ https://issues.apache.org/jira/browse/HAWQ-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Wen updated HAWQ-1616:
--------------------------
    Description: 
Hash Join returns a wrong result in some cases when the Bloom filter is enabled, e.g. when
the join key "l_partkey" is not in the select list:

select  l_quantity, l_partkey,  l_extendedprice  from part, lineitem where p_partkey = l_partkey
and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10; 

 select  l_quantity,  l_extendedprice  from part, lineitem where p_partkey = l_partkey and
p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;

The SQL statements and data are from the TPC-H workload. The correct result should be:
 l_quantity | l_extendedprice
------------+-----------------
       3.00 |         5399.55
       6.00 |         8318.58
      38.00 |        57927.20
      49.00 |        90545.63
      44.00 |        76197.88
      10.00 |        17146.20
      26.00 |        34376.94
      35.00 |        56332.85
       9.00 |        11999.88
      14.00 |        24020.92
(10 rows)

The projection information is not pushed down to the Parquet scan correctly, so the query
currently returns no result.
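
To illustrate how a missing projection could turn into an empty result, here is a toy
Python model. It is only a sketch under assumptions, not HAWQ code: it assumes the scan
materializes only the projected columns and that the Bloom filter is probed at the scan
with the join key value. ToyBloomFilter, scan, and the sample rows are hypothetical names
and made-up values used purely for illustration.

```python
# Toy model only; this is NOT HAWQ's implementation. All names are hypothetical.
from typing import Dict, Iterable, List, Set

Row = Dict[str, object]


class ToyBloomFilter:
    """Stand-in for a Bloom filter; a plain set is enough for this illustration."""

    def __init__(self, keys: Iterable[object]) -> None:
        self._keys: Set[object] = set(keys)

    def might_contain(self, key: object) -> bool:
        return key in self._keys


def scan(rows: List[Row], projected: List[str],
         bloom: ToyBloomFilter, join_key: str) -> List[Row]:
    """Return projected rows whose join key passes the filter."""
    out: List[Row] = []
    for row in rows:
        # Assumption: only projected columns are read; anything else is absent.
        visible = {col: row[col] for col in projected}
        # If join_key was not projected, .get() yields None and the probe fails.
        if bloom.might_contain(visible.get(join_key)):
            out.append(visible)
    return out


# Made-up sample rows (not TPC-H data).
lineitem = [
    {"l_partkey": 1, "l_quantity": 3, "l_extendedprice": 10.0},
    {"l_partkey": 2, "l_quantity": 6, "l_extendedprice": 20.0},
]
bloom = ToyBloomFilter([1])  # join keys surviving the predicates on "part"

# Join key in the select list: the filter sees real key values.
with_key = scan(lineitem, ["l_quantity", "l_partkey", "l_extendedprice"],
                bloom, "l_partkey")

# Join key missing from the projection: every probe fails, nothing is returned.
without_key = scan(lineitem, ["l_quantity", "l_extendedprice"],
                   bloom, "l_partkey")

# One possible remedy in this toy model: always project the join key as well.
required = sorted({"l_quantity", "l_extendedprice"} | {"l_partkey"})
fixed = scan(lineitem, required, bloom, "l_partkey")

print(len(with_key), len(without_key), len(fixed))
```

In this model the second scan is empty only because the join key never reaches the
filter, which matches the symptom described above; whether HAWQ fails in exactly this
way is an assumption.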

  was:
Hash Join returns a wrong result in some cases when the Bloom filter is enabled, e.g. when
the join key "l_partkey" is not in the select list:

select  l_quantity, l_partkey,  l_extendedprice  from part, lineitem where p_partkey = l_partkey
and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10; 

 select  l_quantity,  l_extendedprice  from part, lineitem where p_partkey = l_partkey and
p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;


> Wrong Result of Hash Join When Enable Bloom filter
> --------------------------------------------------
>
>                 Key: HAWQ-1616
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1616
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Query Execution
>            Reporter: Lin Wen
>            Assignee: Lin Wen
>            Priority: Major
>
> Hash Join returns a wrong result in some cases when the Bloom filter is enabled, e.g. when
> the join key "l_partkey" is not in the select list:
> select  l_quantity, l_partkey,  l_extendedprice  from part, lineitem where p_partkey = l_partkey
> and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;
> select  l_quantity,  l_extendedprice  from part, lineitem where p_partkey = l_partkey and
> p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;
> The SQL statements and data are from the TPC-H workload. The correct result should be:
>  l_quantity | l_extendedprice
> ------------+-----------------
>        3.00 |         5399.55
>        6.00 |         8318.58
>       38.00 |        57927.20
>       49.00 |        90545.63
>       44.00 |        76197.88
>       10.00 |        17146.20
>       26.00 |        34376.94
>       35.00 |        56332.85
>        9.00 |        11999.88
>       14.00 |        24020.92
> (10 rows)
> The projection information is not pushed down to the Parquet scan correctly, so the query
> currently returns no result.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
