hawq-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ruilong Huo <r...@pivotal.io>
Subject Re: selective partition scanning may not work
Date Tue, 19 Jan 2016 06:20:47 GMT
>From the plan, there is no DPE (dynamic partition elimination) involved.
That's why full table (instead of target partition) is scanned.

Would you please attach the full output of the query plan including
settings and optimizer status as below?

test=# explain xxx
                                     QUERY PLAN
------------------------------------------------------------------------------------
...
 Settings:  default_segment_num=8
 Optimizer status: PQO version 1.617
(6 rows)

Best regards,
Ruilong Huo

On Tue, Jan 19, 2016 at 12:36 PM, 查 锐 <zrdt713@gmail.com> wrote:

> hi all,
>
> I created a table with date range partition, and my creating table cause
> as follows:
>
> CREATE TABLE zhongxing_2015_obj_12months (RegionID integer, eNodeBID
> bigint, CellID integer, Time varchar(10), v1 integer, v2 integer, v3
> integer, v4 integer, v5 integer, v6 integer, v7 integer, v8 integer, v9
> integer, v10 integer, v11 integer, v12 integer, v13 integer, v14 integer,
> v15 integer, v16 integer, v17 integer, v18 integer, v19 integer, v20
> integer, v21 integer, v22 integer, v23 integer, v24 integer, v25 integer,
> v26 integer, v27 integer, v28 integer, v29 integer, v30 integer, v31
> integer, v32 integer, v33 integer, v34 integer, v35 integer, v36 integer,
> v37 integer, v38 integer, v39 integer, v40 integer, v41 integer, v42
> integer, v43 integer, v44 integer, v45 integer, v46 integer, v47 integer,
> v48 integer, v49 integer, v50 integer, v51 integer, v52 integer, v53
> integer, v54 integer, v55 integer, v56 integer, v57 integer, v58 integer,
> v59 integer, v60 integer, v61 integer, v62 integer, v63 integer, v64
> integer, v65 integer, v66 integer, v67 integer, v68 integer, v69 integer,
> v70 integer, v71 integer, v72 integer, v73 integer, v74 integer, v75
> integer, v76 integer, v77 integer, v78 integer, v79 integer, v80 integer,
> v81 integer, v82 integer, v83 integer, v84 integer, v85 integer, v86
> integer, v87 integer, v88 integer, v89 integer, v90 integer, v91 integer,
> v92 integer, v93 integer, v94 integer, v95 integer, v96 integer, v97
> integer, v98 integer, v99 integer, v100 integer, v101 integer, v102
> integer, v103 integer, v104 integer, v105 integer, v106 integer, v107
> integer, v108 integer, v109 integer, v110 integer, v111 integer, v112
> integer, v113 integer, v114 integer, v115 integer, v116 integer, v117
> integer, v118 integer, v119 integer, v120 integer, v121 integer, v122
> integer, v123 integer, v124 integer, v125 integer, v126 integer, v127
> integer, v128 integer, v129 integer, v130 integer, v131 integer, v132
> integer, v133 integer, v134 integer, v135 integer, v136 integer, v137
> integer, v138 integer, v139 integer, v140 integer, v141 integer, v142
> integer, v143 integer, v144 integer, v145 integer, v146 integer, v147
> integer, v148 integer, v149 integer, v150 integer, v151 integer, v152
> integer, v153 integer, v154 integer, v155 integer, v156 integer, v157
> integer, v158 integer, v159 integer, v160 integer, v161 integer, v162
> integer, v163 integer, v164 integer, v165 integer, v166 integer, v167
> integer, v168 integer, v169 integer, v170 integer, v171 integer, v172
> integer, v173 integer, v174 integer, v175 integer, day date) WITH
> (appendonly=true, orientation=parquet)
>   DISTRIBUTED BY (Time)
>   PARTITION BY RANGE(day)
> ( PARTITION JAN START (date '2015-01-01') INCLUSIVE ,
> PARTITION FEB START (date '2015-02-01') INCLUSIVE ,
> PARTITION MAR START (date '2015-03-01') INCLUSIVE ,
> PARTITION APR START (date '2015-04-01') INCLUSIVE ,
> PARTITION MAY START (date '2015-05-01') INCLUSIVE ,
> PARTITION JUN START (date '2015-06-01') INCLUSIVE ,
> PARTITION JUL START (date '2015-07-01') INCLUSIVE ,
> PARTITION AUG START (date '2015-08-01') INCLUSIVE ,
> PARTITION SEP START (date '2015-09-01') INCLUSIVE ,
> PARTITION OCT START (date '2015-10-01') INCLUSIVE ,
> PARTITION NOV START (date '2015-11-01') INCLUSIVE ,
> PARTITION DEC START (date '2015-12-01') INCLUSIVE
>                   END (date '2016-01-01') EXCLUSIVE );
>
> i.e the table has 12 partitions, and 1 month data in each partition.
>
> And I INSERT the table with the sql
>
> INSERT INTO zhongxing_2015_obj_12months SELECT  RegionID, eNodeBID,
> CellID, Time, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14,
> v15, v16, v17, v18, v19, v20, v21, v22, v23, v24, v25, v26, v27, v28, v29,
> v30, v31, v32, v33, v34, v35, v36, v37, v38, v39, v40, v41, v42, v43, v44,
> v45, v46, v47, v48, v49, v50, v51, v52, v53, v54, v55, v56, v57, v58, v59,
> v60, v61, v62, v63, v64, v65, v66, v67, v68, v69, v70, v71, v72, v73, v74,
> v75, v76, v77, v78, v79, v80, v81, v82, v83, v84, v85, v86, v87, v88, v89,
> v90, v91, v92, v93, v94, v95, v96, v97, v98, v99, v100, v101, v102, v103,
> v104, v105, v106, v107, v108, v109, v110, v111, v112, v113, v114, v115,
> v116, v117, v118, v119, v120, v121, v122, v123, v124, v125, v126, v127,
> v128, v129, v130, v131, v132, v133, v134, v135, v136, v137, v138, v139,
> v140, v141, v142, v143, v144, v145, v146, v147, v148, v149, v150, v151,
> v152, v153, v154, v155, v156, v157, v158, v159, v160, v161, v162, v163,
> v164, v165, v166, v167, v168, v169, v170, v171, v172, v173, v174, v175,
> to_date(substring(Time from 1 for 6),'yyMMdd') AS day FROM zhongxing_2015;
>
> but when I query one partition with the sql
>
> select
> sum(v1)+sum(v2)+sum(v3)+sum(v4)+sum(v5) as P1,
> sum(v6)+sum(v7)+sum(v8)+sum(v9)+sum(v10) as P2,
> sum(v11)+sum(v12)+sum(v13)+sum(v14)+sum(v15) as P3,
> sum(v16)+sum(v17)+sum(v18)+sum(v19)+sum(v20) as P4,
> sum(v21)+sum(v22)+sum(v23)+sum(v24)+sum(v25) as P5,
> sum(v26)+sum(v27)+sum(v28)+sum(v29)+sum(v30) as P6,
> sum(v31)+sum(v32)+sum(v33)+sum(v34)+sum(v35) as P7,
> sum(v36)+sum(v37)+sum(v38)+sum(v39)+sum(v40) as P8,
> sum(v41)+sum(v42)+sum(v43)+sum(v44)+sum(v45) as P9,
> sum(v46)+sum(v47)+sum(v48)+sum(v49)+sum(v50) as P10,
> sum(v51)+sum(v52)+sum(v53)+sum(v54)+sum(v55) as P11,
> sum(v56)+sum(v57)+sum(v58)+sum(v59)+sum(v60) as P12,
> sum(v61)+sum(v62)+sum(v63)+sum(v64)+sum(v65) as P13,
> sum(v66)+sum(v67)+sum(v68)+sum(v69)+sum(v70) as P14,
> sum(v71)+sum(v72)+sum(v73)+sum(v74)+sum(v75) as P15,
> sum(v76)+sum(v77)+sum(v78)+sum(v79)+sum(v80) as P16,
> sum(v81)+sum(v82)+sum(v83)+sum(v84)+sum(v85) as P17,
> sum(v86)+sum(v87)+sum(v88)+sum(v89)+sum(v90) as P18,
> sum(v91)+sum(v92)+sum(v93)+sum(v94)+sum(v95) as P19,
> sum(v96)+sum(v97)+sum(v98)+sum(v99)+sum(v100) as P20,
> sum(v101)+sum(v102)+sum(v103)+sum(v104)+sum(v105) as P21,
> sum(v106)+sum(v107)+sum(v108)+sum(v109)+sum(v110) as P22,
> sum(v111)+sum(v112)+sum(v113)+sum(v114)+sum(v115) as P23,
> sum(v116)+sum(v117)+sum(v118)+sum(v119)+sum(v120) as P24,
> sum(v121)+sum(v122)+sum(v123)+sum(v124)+sum(v125) as P25,
> sum(v126)+sum(v127)+sum(v128)+sum(v129)+sum(v130) as P26,
> sum(v131)+sum(v132)+sum(v133)+sum(v134)+sum(v135) as P27,
> sum(v136)+sum(v137)+sum(v138)+sum(v139)+sum(v140) as P28,
> sum(v141)+sum(v142)+sum(v143)+sum(v144)+sum(v145) as P29,
> sum(v146)+sum(v147)+sum(v148)+sum(v149)+sum(v150) as P30,
> sum(v151)+sum(v152)+sum(v153)+sum(v154)+sum(v155) as P31,
> sum(v156)+sum(v157)+sum(v158)+sum(v159)+sum(v160) as P32,
> sum(v161)+sum(v162)+sum(v163)+sum(v164)+sum(v165) as P33,
> sum(v166)+sum(v167)+sum(v168)+sum(v169)+sum(v170) as P34,
> sum(v171)+sum(v172)+sum(v173)+sum(v174)+sum(v175) as P35,
> substr(time,1,4) as month,
> eNodeBID,
> CellID
> from zhongxing_2015_obj_12months
> where (day >= '2015-03-01' and day <= '2015-03-31')
> and (Time between '1503010000' and '1503312345')
> and RegionID=999
> group by eNodeBID,CellID,month
> order by P1 DESC limit 50;
>
> I found the performance is not good, and when I substitute the table name
> ‘zhongxing_2015_obj_12months’ with the sub partition table name ‘
> zhongxing_2015_obj_12months_1_prt_mar’, I got almost 1x performance
> improvement.
>
> I think for the partitioned table, this two sql will be parsed to the same
> thing. So I wonder why this situation could be happened. I run the explain
> query, and the result as follows:
>
>
>                   QUERY PLAN
>
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=1725101.68..1725102.80 rows=50 width=324)
>    ->  Gather Motion 192:1  (slice2; segments: 192)
> (cost=1725101.68..1725102.80 rows=50 width=324)
>          Merge Key: p1
>          ->  Limit  (cost=1725101.68..1725101.80 rows=1 width=324)
>                ->  Sort  (cost=1725101.68..1725113.19 rows=24 width=324)
>                      Sort Key (Limit): p1
>                      ->  HashAggregate  (cost=1721149.86..1724821.55
> rows=24 width=324)
>                            Group By:
> public.zhongxing_2015_obj_12months.enodebid,
> public.zhongxing_2015_obj_12months.cellid, "?column3?"
>                            ->  Redistribute Motion 192:192  (slice1;
> segments: 192)  (cost=1718939.94..1719055.04 rows=24 width=1444)
>                                  Hash Key:
> public.zhongxing_2015_obj_12months.enodebid,
> public.zhongxing_2015_obj_12months.cellid, unnamed_attr_3
>                                  ->  HashAggregate
> (cost=1718939.94..1718962.96 rows=24 width=1444)
>                                        Group By:
> public.zhongxing_2015_obj_12months.enodebid,
> public.zhongxing_2015_obj_12months.cellid,
> substr(public.zhongxing_2015_obj_12months.day::text, 1, 7)
>                                        ->  Result  (cost=0.00..1698454.59
> rows=240 width=716)
>                                              ->  Append
> (cost=0.00..1698224.41 rows=240 width=716)
>                                                    ->  Parquet table Scan
> on zhongxing_2015_obj_12months_1_prt_mar zhongxing_2015_obj_12months
> (cost=0.00..1698224.41 rows=240 width=716)
>                                                          Filter: day =
> '2015-03-01 00:00:00'::timestamp without time zone
>
> I can see the “Parquet table Scan on zhongxing_2015_obj12months_1_prt_mar
> zhongxing_2015_obj_12months” part, dose this mean scanner scanning ALL DATA
>  of all partitions in table zhongxing_2015_obj_12months plus one sub table
> zhongxing_2015_obj_12months_1_prt_mar include one partition of 12?
>
> I will waiting for the answer online, this problem is very critical and
> urgent for me!
>
> Thank you so mush!!!
>
> Best Regards
> Rui Zha
>
>
>

Mime
View raw message