hawq-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ruilong Huo <r...@pivotal.io>
Subject Re: selective partition scanning may not work
Date Tue, 19 Jan 2016 06:30:30 GMT
>From Filter: day = '2015-03-01 00:00:00'::timestamp without time zone , we
can see '2015-03-01' is auto casted to timestamp without time zone. So, it
is not a "string" any more.

Best regards,
Ruilong Huo

On Tue, Jan 19, 2016 at 2:24 PM, 陶进 <tonytao0505@outlook.com> wrote:

> it was because the item day was compare to a string '2016-03-01' while
> day  type is data?
> I guessed!
>
>
> 在 2016/1/19 12:36, 查 锐 写道:
>
> hi all,
>
> I created a table with date range partition, and my creating table cause
> as follows:
>
> CREATE TABLE zhongxing_2015_obj_12months (RegionID integer, eNodeBID
> bigint, CellID integer, Time varchar(10), v1 integer, v2 integer, v3
> integer, v4 integer, v5 integer, v6 integer, v7 integer, v8 integer, v9
> integer, v10 integer, v11 integer, v12 integer, v13 integer, v14 integer,
> v15 integer, v16 integer, v17 integer, v18 integer, v19 integer, v20
> integer, v21 integer, v22 integer, v23 integer, v24 integer, v25 integer,
> v26 integer, v27 integer, v28 integer, v29 integer, v30 integer, v31
> integer, v32 integer, v33 integer, v34 integer, v35 integer, v36 integer,
> v37 integer, v38 integer, v39 integer, v40 integer, v41 integer, v42
> integer, v43 integer, v44 integer, v45 integer, v46 integer, v47 integer,
> v48 integer, v49 integer, v50 integer, v51 integer, v52 integer, v53
> integer, v54 integer, v55 integer, v56 integer, v57 integer, v58 integer,
> v59 integer, v60 integer, v61 integer, v62 integer, v63 integer, v64
> integer, v65 integer, v66 integer, v67 integer, v68 integer, v69 integer,
> v70 integer, v71 integer, v72 integer, v73 integer, v74 integer, v75
> integer, v76 integer, v77 integer, v78 integer, v79 integer, v80 integer,
> v81 integer, v82 integer, v83 integer, v84 integer, v85 integer, v86
> integer, v87 integer, v88 integer, v89 integer, v90 integer, v91 integer,
> v92 integer, v93 integer, v94 integer, v95 integer, v96 integer, v97
> integer, v98 integer, v99 integer, v100 integer, v101 integer, v102
> integer, v103 integer, v104 integer, v105 integer, v106 integer, v107
> integer, v108 integer, v109 integer, v110 integer, v111 integer, v112
> integer, v113 integer, v114 integer, v115 integer, v116 integer, v117
> integer, v118 integer, v119 integer, v120 integer, v121 integer, v122
> integer, v123 integer, v124 integer, v125 integer, v126 integer, v127
> integer, v128 integer, v129 integer, v130 integer, v131 integer, v132
> integer, v133 integer, v134 integer, v135 integer, v136 integer, v137
> integer, v138 integer, v139 integer, v140 integer, v141 integer, v142
> integer, v143 integer, v144 integer, v145 integer, v146 integer, v147
> integer, v148 integer, v149 integer, v150 integer, v151 integer, v152
> integer, v153 integer, v154 integer, v155 integer, v156 integer, v157
> integer, v158 integer, v159 integer, v160 integer, v161 integer, v162
> integer, v163 integer, v164 integer, v165 integer, v166 integer, v167
> integer, v168 integer, v169 integer, v170 integer, v171 integer, v172
> integer, v173 integer, v174 integer, v175 integer, day date) WITH
> (appendonly=true, orientation=parquet)
>   DISTRIBUTED BY (Time)
>   PARTITION BY RANGE(day)
> ( PARTITION JAN START (date '2015-01-01') INCLUSIVE ,
> PARTITION FEB START (date '2015-02-01') INCLUSIVE ,
> PARTITION MAR START (date '2015-03-01') INCLUSIVE ,
> PARTITION APR START (date '2015-04-01') INCLUSIVE ,
> PARTITION MAY START (date '2015-05-01') INCLUSIVE ,
> PARTITION JUN START (date '2015-06-01') INCLUSIVE ,
> PARTITION JUL START (date '2015-07-01') INCLUSIVE ,
> PARTITION AUG START (date '2015-08-01') INCLUSIVE ,
> PARTITION SEP START (date '2015-09-01') INCLUSIVE ,
> PARTITION OCT START (date '2015-10-01') INCLUSIVE ,
> PARTITION NOV START (date '2015-11-01') INCLUSIVE ,
> PARTITION DEC START (date '2015-12-01') INCLUSIVE
>                   END (date '2016-01-01') EXCLUSIVE );
>
> i.e the table has 12 partitions, and 1 month data in each partition.
>
> And I INSERT the table with the sql
>
> INSERT INTO zhongxing_2015_obj_12months SELECT  RegionID, eNodeBID,
> CellID, Time, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14,
> v15, v16, v17, v18, v19, v20, v21, v22, v23, v24, v25, v26, v27, v28, v29,
> v30, v31, v32, v33, v34, v35, v36, v37, v38, v39, v40, v41, v42, v43, v44,
> v45, v46, v47, v48, v49, v50, v51, v52, v53, v54, v55, v56, v57, v58, v59,
> v60, v61, v62, v63, v64, v65, v66, v67, v68, v69, v70, v71, v72, v73, v74,
> v75, v76, v77, v78, v79, v80, v81, v82, v83, v84, v85, v86, v87, v88, v89,
> v90, v91, v92, v93, v94, v95, v96, v97, v98, v99, v100, v101, v102, v103,
> v104, v105, v106, v107, v108, v109, v110, v111, v112, v113, v114, v115,
> v116, v117, v118, v119, v120, v121, v122, v123, v124, v125, v126, v127,
> v128, v129, v130, v131, v132, v133, v134, v135, v136, v137, v138, v139,
> v140, v141, v142, v143, v144, v145, v146, v147, v148, v149, v150, v151,
> v152, v153, v154, v155, v156, v157, v158, v159, v160, v161, v162, v163,
> v164, v165, v166, v167, v168, v169, v170, v171, v172, v173, v174, v175,
> to_date(substring(Time from 1 for 6),'yyMMdd') AS day FROM zhongxing_2015;
>
> but when I query one partition with the sql
>
> select
> sum(v1)+sum(v2)+sum(v3)+sum(v4)+sum(v5) as P1,
> sum(v6)+sum(v7)+sum(v8)+sum(v9)+sum(v10) as P2,
> sum(v11)+sum(v12)+sum(v13)+sum(v14)+sum(v15) as P3,
> sum(v16)+sum(v17)+sum(v18)+sum(v19)+sum(v20) as P4,
> sum(v21)+sum(v22)+sum(v23)+sum(v24)+sum(v25) as P5,
> sum(v26)+sum(v27)+sum(v28)+sum(v29)+sum(v30) as P6,
> sum(v31)+sum(v32)+sum(v33)+sum(v34)+sum(v35) as P7,
> sum(v36)+sum(v37)+sum(v38)+sum(v39)+sum(v40) as P8,
> sum(v41)+sum(v42)+sum(v43)+sum(v44)+sum(v45) as P9,
> sum(v46)+sum(v47)+sum(v48)+sum(v49)+sum(v50) as P10,
> sum(v51)+sum(v52)+sum(v53)+sum(v54)+sum(v55) as P11,
> sum(v56)+sum(v57)+sum(v58)+sum(v59)+sum(v60) as P12,
> sum(v61)+sum(v62)+sum(v63)+sum(v64)+sum(v65) as P13,
> sum(v66)+sum(v67)+sum(v68)+sum(v69)+sum(v70) as P14,
> sum(v71)+sum(v72)+sum(v73)+sum(v74)+sum(v75) as P15,
> sum(v76)+sum(v77)+sum(v78)+sum(v79)+sum(v80) as P16,
> sum(v81)+sum(v82)+sum(v83)+sum(v84)+sum(v85) as P17,
> sum(v86)+sum(v87)+sum(v88)+sum(v89)+sum(v90) as P18,
> sum(v91)+sum(v92)+sum(v93)+sum(v94)+sum(v95) as P19,
> sum(v96)+sum(v97)+sum(v98)+sum(v99)+sum(v100) as P20,
> sum(v101)+sum(v102)+sum(v103)+sum(v104)+sum(v105) as P21,
> sum(v106)+sum(v107)+sum(v108)+sum(v109)+sum(v110) as P22,
> sum(v111)+sum(v112)+sum(v113)+sum(v114)+sum(v115) as P23,
> sum(v116)+sum(v117)+sum(v118)+sum(v119)+sum(v120) as P24,
> sum(v121)+sum(v122)+sum(v123)+sum(v124)+sum(v125) as P25,
> sum(v126)+sum(v127)+sum(v128)+sum(v129)+sum(v130) as P26,
> sum(v131)+sum(v132)+sum(v133)+sum(v134)+sum(v135) as P27,
> sum(v136)+sum(v137)+sum(v138)+sum(v139)+sum(v140) as P28,
> sum(v141)+sum(v142)+sum(v143)+sum(v144)+sum(v145) as P29,
> sum(v146)+sum(v147)+sum(v148)+sum(v149)+sum(v150) as P30,
> sum(v151)+sum(v152)+sum(v153)+sum(v154)+sum(v155) as P31,
> sum(v156)+sum(v157)+sum(v158)+sum(v159)+sum(v160) as P32,
> sum(v161)+sum(v162)+sum(v163)+sum(v164)+sum(v165) as P33,
> sum(v166)+sum(v167)+sum(v168)+sum(v169)+sum(v170) as P34,
> sum(v171)+sum(v172)+sum(v173)+sum(v174)+sum(v175) as P35,
> substr(time,1,4) as month,
> eNodeBID,
> CellID
> from zhongxing_2015_obj_12months
> where (day >= '2015-03-01' and day <= '2015-03-31')
> and (Time between '1503010000' and '1503312345')
> and RegionID=999
> group by eNodeBID,CellID,month
> order by P1 DESC limit 50;
>
> I found the performance is not good, and when I substitute the table name
> ‘zhongxing_2015_obj_12months’ with the sub partition table name ‘
> zhongxing_2015_obj_12months_1_prt_mar’, I got almost 1x performance
> improvement.
>
> I think for the partitioned table, this two sql will be parsed to the same
> thing. So I wonder why this situation could be happened. I run the explain
> query, and the result as follows:
>
>
>                   QUERY PLAN
>
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=1725101.68..1725102.80 rows=50 width=324)
>    ->  Gather Motion 192:1  (slice2; segments: 192)
> (cost=1725101.68..1725102.80 rows=50 width=324)
>          Merge Key: p1
>          ->  Limit  (cost=1725101.68..1725101.80 rows=1 width=324)
>                ->  Sort  (cost=1725101.68..1725113.19 rows=24 width=324)
>                      Sort Key (Limit): p1
>                      ->  HashAggregate  (cost=1721149.86..1724821.55
> rows=24 width=324)
>                            Group By:
> public.zhongxing_2015_obj_12months.enodebid,
> public.zhongxing_2015_obj_12months.cellid, "?column3?"
>                            ->  Redistribute Motion 192:192  (slice1;
> segments: 192)  (cost=1718939.94..1719055.04 rows=24 width=1444)
>                                  Hash Key:
> public.zhongxing_2015_obj_12months.enodebid,
> public.zhongxing_2015_obj_12months.cellid, unnamed_attr_3
>                                  ->  HashAggregate
> (cost=1718939.94..1718962.96 rows=24 width=1444)
>                                        Group By:
> public.zhongxing_2015_obj_12months.enodebid,
> public.zhongxing_2015_obj_12months.cellid,
> substr(public.zhongxing_2015_obj_12months.day::text, 1, 7)
>                                        ->  Result  (cost=0.00..1698454.59
> rows=240 width=716)
>                                              ->  Append
> (cost=0.00..1698224.41 rows=240 width=716)
>                                                    ->  Parquet table Scan
> on zhongxing_2015_obj_12months_1_prt_mar zhongxing_2015_obj_12months
> (cost=0.00..1698224.41 rows=240 width=716)
>                                                          Filter: day =
> '2015-03-01 00:00:00'::timestamp without time zone
>
> I can see the “Parquet table Scan on zhongxing_2015_obj12months_1_prt_mar
> zhongxing_2015_obj_12months” part, dose this mean scanner scanning ALL DATA
>  of all partitions in table zhongxing_2015_obj_12months plus one sub table
> zhongxing_2015_obj_12months_1_prt_mar include one partition of 12?
>
> I will waiting for the answer online, this problem is very critical and
> urgent for me!
>
> Thank you so mush!!!
>
> Best Regards
> Rui Zha
>
>
>
>
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient>
受
> Avast 保护的无病毒计算机已发送该电子邮件。
> www.avast.com
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient>
>

Mime
View raw message