hawq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From 陶征霖 <ztao1...@apache.org>
Subject Re: why these 2 queries had different explain
Date Mon, 22 May 2017 13:55:50 GMT
Hi tony,

Yes, confirmed it's a bug. Filed the bug in apache hawq jira
https://issues.apache.org/jira/browse/HAWQ-1470. Will fix it later.

Thanks,
Zhenglin

2017-05-22 10:31 GMT+08:00 tao tony <tonytao0505@outlook.com>:

> OK.this bug appears when using gpfdist external table in select list.I
> create some test data to reproduce this bug in hawq docker.
>
> 1.create a external table
>
> CREATE EXTERNAL TABLE testext (
>      a int,
>      b character varying(255)
> ) LOCATION (
>      'gpfdist://172.19.0.2:8087/test.csv'
> ) FORMAT 'text' (delimiter E',' null E'' escape E'OFF');
>
> test.csv file contains:
>
> cat gpdata/test.csv
> 1,abc
> 2,bce
> 3,ced
>
> 2.create a internal table:
>
> create table test1(c int);
>
> insert into test1 values(1);
>
> insert into test1 values(2);
>
> insert into test1 values(3);
>
> insert into test1 values(4);
>
> 3.run query,could not get testext.b values:
>
> select c,(select s.b from testext s where t.c=s.a) from test1 t;
>   c | ?column?
> ---+----------
>   1 |
>   2 |
>   3 |
>   4 |
> (4 rows)
> 4.create an internal table from testext,and run the query again,ti
> returns the correct result
>
> create table test as select * from testext;
>
> select c,(select s.b from test s where t.c=s.a) from test1 t;
>   c | ?column?
> ---+----------
>   1 | abc
>   2 | bce
>   3 | ced
>   4 |
> (4 rows)
>
> you could compare the 2  explains,in step 3 ,testext was not broadcasted.
>
>
>
> On 05/22/2017 09:25 AM, 陶征霖 wrote:
> > Hi tony,
> >
> > Could you please provide the simple reproduce steps so that we can easily
> > debug in our own env.?
> >
> > Thanks,
> > Zhenglin
> >
> > 2017-05-18 14:17 GMT+08:00 tao tony <tonytao0505@outlook.com>:
> >
> >> it seems a bug for querying on an external table because I found
> >> ucloud_pay_tenanid ulist was a external table which using gpfdist.
> >>
> >> CREATE EXTERNAL TABLE ucloud_pay_tenanid (
> >>       customercode character varying(255),
> >>       customername character varying(255),
> >>       prefixflag character varying(255),
> >>       customertype character varying(255),
> >>       comments character varying(255)
> >> ) LOCATION (
> >> 'gpfdist://hdptest02.hddomain.cn:8087/ucloud_pay_tenanid_*.csv'
> >> ) FORMAT 'text' (delimiter E';' null E'' escape E'OFF')
> >> ENCODING 'UTF8';
> >>
> >> then I create an internal table using:
> >>
> >> create table ucloudtest as select * from ucloud_pay_tenanid;
> >>
> >> run  the explain,it lookks like the right query plan:
> >>
> >> hdb=# explain select cp.tenantid,
> >> hdb-#  (select ulist.customertype  from ucloudtest ulist where
> >> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >>                                                  QUERY PLAN
> >> ------------------------------------------------------------
> >> ---------------------------------------------
> >>    Gather Motion 1:1  (slice2; segments: 1)  (cost=0.00..219.50 rows=100
> >> width=42)
> >>      ->  Append-only Scan on cptest cp  (cost=0.00..219.50 rows=100
> >> width=42)
> >>            SubPlan 1
> >>              ->  Result  (cost=2.18..2.19 rows=1 width=7)
> >>                    Filter: ulist.customercode::text = $0::text
> >>                    ->  Materialize  (cost=2.18..2.19 rows=1 width=7)
> >>                          ->  Broadcast Motion 1:1  (slice1; segments: 1)
> >> (cost=0.00..2.17 rows=1 width=7)
> >>                                ->  Append-only Scan on ucloudtest ulist
> >> (cost=0.00..2.17 rows=1 width=7)
> >>    Settings:  default_hash_table_bucket_number=18; optimizer=off
> >>    Optimizer status: legacy query optimizer
> >> (10 rows)
> >>
> >> run the query,get the correct result:
> >>
> >> hdb=# select cp.tenantid,
> >>    (select ulist.customertype  from ucloudtest ulist where
> >> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >>    tenantid | ?column? |            orderuuid
> >> ----------+----------+----------------------------------
> >>    sxve7r6c | 便利     | e6d9b57a0c55484392448ea908c1fe49
> >>    sxve7r6c | 便利     | 22a80697bfc74d63b7f28eee246c4368
> >>    3e7rph46 | 专卖     | 420ad3e45762459e91860b975e9f2751
> >>    3e7rph46 | 专卖     | 0634e7e3539a4116b9917a7493838f51
> >>    7jvfka5m | 专卖     | a7b96194fe9f48379e2711ac6000191b
> >>    6xydfh4y | 便利     | 7a55e97119784623a53f6e65ef9680c7
> >>    sxve7r6c | 便利     | 227f3d22aec14723bb51efc4e2a6f0b4
> >>    3e7rph46 | 专卖     | f3d02cc77a2348829be2f72ce24bf846
> >>    6xydfh4y | 便利     | bab722ac7d5748408d3ad2973d292ab5
> >>
> >>
> >> On 05/18/2017 11:11 AM, tao tony wrote:
> >>> hi guys,
> >>>
> >>> The different explains as below make me confused these days,could you
> >>> please help me to explain why ulist.customertype is null.
> >>>
> >>> explain :
> >>>
> >>> hdb=# explain select cp.tenantid,
> >>>     (select ulist.customertype  from ucloud_pay_tenanid ulist where
> >>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >>>                                                QUERY PLAN
> >>> ------------------------------------------------------------
> >> --------------------------------------
> >>>     Gather Motion 18:1  (slice1; segments: 18) (cost=0.00..1350002.00
> >>> rows=100 width=42)
> >>>       ->  Append-only Scan on cptest cp  (cost=0.00..1350002.00 rows=6
> >>> width=42)
> >>>             SubPlan 1
> >>>               ->  External Scan on ucloud_pay_tenanid ulist
> >>> (cost=0.00..13500.00 rows=56 width=516)
> >>>                     Filter: customercode::text = $0::text
> >>>     Settings:  default_hash_table_bucket_number=18; optimizer=off
> >>>     Optimizer status: legacy query optimizer
> >>> (7 rows)
> >>>
> >>>
> >>> hdb=# explain select a,(select d from test1 s where t.b=s.e) from test2
> >> t;
> >>>                                                   QUERY PLAN
> >>> ------------------------------------------------------------
> >> ---------------------------------------------
> >>>     Gather Motion 1:1  (slice2; segments: 1)  (cost=0.00..10.20 rows=9
> >>> width=6)
> >>>       ->  Append-only Scan on test2 t  (cost=0.00..10.20 rows=9
> width=6)
> >>>             SubPlan 1
> >>>               ->  Result  (cost=1.01..1.02 rows=1 width=4)
> >>>                     Filter: $0::text = s.e::text
> >>>                     ->  Materialize  (cost=1.01..1.02 rows=1 width=4)
> >>>                           ->  Broadcast Motion 1:1  (slice1; segments:
> 1)
> >>> (cost=0.00..1.01 rows=1 width=4)
> >>>                                 ->  Append-only Scan on test1 s
> >>> (cost=0.00..1.01 rows=1 width=4)
> >>>     Settings:  default_hash_table_bucket_number=18; optimizer=off
> >>>     Optimizer status: legacy query optimizer
> >>> (10 rows)
> >>>
> >>> test data:
> >>>
> >>> query1:
> >>>
> >>> hdb-#  (select ulist.customertype  from ucloud_pay_tenanid ulist where
> >>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp;
> >>>     tenantid | ?column? |            orderuuid
> >>> ----------+----------+----------------------------------
> >>>     sxve7r6c |          | 901cf777a3144907899226de86ab8cfe
> >>>     k5r9qcfj |          | 54c8cbdbe64f4dcca5cacb40b7ab52d6
> >>>     3e7rph46 |          | f5e777a5189e409da607182d059ec0d5
> >>>     3e7rph46 |          | 478d8a0921ab444ca80dc03ce97d3a94
> >>>     3e7rph46 |          | 25808c4a66e34ecaad02d2fc183920ca
> >>>     3e7rph46 |          | 0a62832067ea4db68641cce35385aae7
> >>>     3e7rph46 |          | 4c9e91233af947518c5c5767a2bd8b3c
> >>>     6xydfh4y |          | b2fa1af123384452b23bbc5e0794da56
> >>>     6xydfh4y |          | ade5731d97044a0eb9d74836ea7dbf48
> >>>     pejg93wh |          | da2d037765bb4262a82764715cb4453d
> >>>     sxve7r6c |          | b02df6c998ff4e448b575e3b1fae8e35
> >>>     3e7rph46 |          | 4c9e91233af947518c5c5767a2bd8b3c
> >>>
> >>> query 2:
> >>>
> >>> hdb=# select a,(select d from test1 s where t.b=s.e) from test2 t;
> >>>     a | ?column?
> >>> ---+----------
> >>>     1 |        1
> >>>     2 |        2
> >>>     3 |        3
> >>>     4 |        4
> >>>     5 |        5
> >>>     6 |        7
> >>>     7 |        8
> >>>     8 |        9
> >>>     9 |       10
> >>> (9 rows)
> >>>
> >>> another query equal to query1,customertype is not null:
> >>>
> >>> hdb=# select tenantid,ulist.customertype,cp.orderuuid from cptest
> >>> cp,ucloud_pay_tenanid ulist
> >>> hdb-# where  cp.tenantid = ulist.customercode;
> >>>     tenantid | customertype |            orderuuid
> >>> ----------+--------------+----------------------------------
> >>>     7jvfka5m | 专卖         | f10771c9eb4e434dadcc82dc7cda0d4d
> >>>     7jvfka5m | 专卖         | 246a13ed45174ee083fd9f743532ab4c
> >>>     7jvfka5m | 专卖         | e2687954d3e94ab0bf1366369bdc9887
> >>>     7jvfka5m | 专卖         | 018ef9e35db94901b4822509494bd6b5
> >>>     7jvfka5m | 专卖         | 9054d2d5d7264a8aa4b41baddf1198e8
> >>>     7jvfka5m | 专卖         | 9054d2d5d7264a8aa4b41baddf1198e8
> >>>     7jvfka5m | 专卖         | a7b96194fe9f48379e2711ac6000191b
> >>>     u642nk35 | 便利         | df4ef8599c154c978d021b6765edb55b
> >>>     u642nk35 | 便利         | d60c5d3ac0f44ef1a278d50d84847c2f
> >>>     u642nk35 | 便利         | df6ca3d8fb444f8fba604db8611c6292
> >>>     u642nk35 | 便利         | df4ef8599c154c978d021b6765edb55b
> >>>
> >>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message