hawq-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From 查 锐 <zrdt...@gmail.com>
Subject Re: selective partition scanning may not work
Date Tue, 19 Jan 2016 06:43:55 GMT
Even transforming string to date type the result just all the same...

> 在 2016年1月19日,14:24,陶进 <tonytao0505@outlook.com> 写道:
> it was because the item day was compare to a string '2016-03-01' while day  type is data?
> I guessed!
> 在 2016/1/19 12:36, 查 锐 写道:
>> hi all,
>> I created a table with date range partition, and my creating table cause as follows:
>> CREATE TABLE zhongxing_2015_obj_12months (RegionID integer, eNodeBID bigint, CellID
integer, Time varchar(10), v1 integer, v2 integer, v3 integer, v4 integer, v5 integer, v6
integer, v7 integer, v8 integer, v9 integer, v10 integer, v11 integer, v12 integer, v13 integer,
v14 integer, v15 integer, v16 integer, v17 integer, v18 integer, v19 integer, v20 integer,
v21 integer, v22 integer, v23 integer, v24 integer, v25 integer, v26 integer, v27 integer,
v28 integer, v29 integer, v30 integer, v31 integer, v32 integer, v33 integer, v34 integer,
v35 integer, v36 integer, v37 integer, v38 integer, v39 integer, v40 integer, v41 integer,
v42 integer, v43 integer, v44 integer, v45 integer, v46 integer, v47 integer, v48 integer,
v49 integer, v50 integer, v51 integer, v52 integer, v53 integer, v54 integer, v55 integer,
v56 integer, v57 integer, v58 integer, v59 integer, v60 integer, v61 integer, v62 integer,
v63 integer, v64 integer, v65 integer, v66 integer, v67 integer, v68 integer, v69 integer,
v70 integer, v71 integer, v72 integer, v73 integer, v74 integer, v75 integer, v76 integer,
v77 integer, v78 integer, v79 integer, v80 integer, v81 integer, v82 integer, v83 integer,
v84 integer, v85 integer, v86 integer, v87 integer, v88 integer, v89 integer, v90 integer,
v91 integer, v92 integer, v93 integer, v94 integer, v95 integer, v96 integer, v97 integer,
v98 integer, v99 integer, v100 integer, v101 integer, v102 integer, v103 integer, v104 integer,
v105 integer, v106 integer, v107 integer, v108 integer, v109 integer, v110 integer, v111 integer,
v112 integer, v113 integer, v114 integer, v115 integer, v116 integer, v117 integer, v118 integer,
v119 integer, v120 integer, v121 integer, v122 integer, v123 integer, v124 integer, v125 integer,
v126 integer, v127 integer, v128 integer, v129 integer, v130 integer, v131 integer, v132 integer,
v133 integer, v134 integer, v135 integer, v136 integer, v137 integer, v138 integer, v139 integer,
v140 integer, v141 integer, v142 integer, v143 integer, v144 integer, v145 integer, v146 integer,
v147 integer, v148 integer, v149 integer, v150 integer, v151 integer, v152 integer, v153 integer,
v154 integer, v155 integer, v156 integer, v157 integer, v158 integer, v159 integer, v160 integer,
v161 integer, v162 integer, v163 integer, v164 integer, v165 integer, v166 integer, v167 integer,
v168 integer, v169 integer, v170 integer, v171 integer, v172 integer, v173 integer, v174 integer,
v175 integer, day date) WITH (appendonly=true, orientation=parquet)
>> ( PARTITION JAN START (date '2015-01-01') INCLUSIVE ,
>> PARTITION FEB START (date '2015-02-01') INCLUSIVE ,
>> PARTITION MAR START (date '2015-03-01') INCLUSIVE ,
>> PARTITION APR START (date '2015-04-01') INCLUSIVE ,
>> PARTITION MAY START (date '2015-05-01') INCLUSIVE ,
>> PARTITION JUN START (date '2015-06-01') INCLUSIVE ,
>> PARTITION JUL START (date '2015-07-01') INCLUSIVE ,
>> PARTITION AUG START (date '2015-08-01') INCLUSIVE ,
>> PARTITION SEP START (date '2015-09-01') INCLUSIVE ,
>> PARTITION OCT START (date '2015-10-01') INCLUSIVE ,
>> PARTITION NOV START (date '2015-11-01') INCLUSIVE ,
>> PARTITION DEC START (date '2015-12-01') INCLUSIVE
>>                   END (date '2016-01-01') EXCLUSIVE );
>> i.e the table has 12 partitions, and 1 month data in each partition.
>> And I INSERT the table with the sql
>> INSERT INTO zhongxing_2015_obj_12months SELECT  RegionID, eNodeBID, CellID, Time,
v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14, v15, v16, v17, v18, v19, v20,
v21, v22, v23, v24, v25, v26, v27, v28, v29, v30, v31, v32, v33, v34, v35, v36, v37, v38,
v39, v40, v41, v42, v43, v44, v45, v46, v47, v48, v49, v50, v51, v52, v53, v54, v55, v56,
v57, v58, v59, v60, v61, v62, v63, v64, v65, v66, v67, v68, v69, v70, v71, v72, v73, v74,
v75, v76, v77, v78, v79, v80, v81, v82, v83, v84, v85, v86, v87, v88, v89, v90, v91, v92,
v93, v94, v95, v96, v97, v98, v99, v100, v101, v102, v103, v104, v105, v106, v107, v108, v109,
v110, v111, v112, v113, v114, v115, v116, v117, v118, v119, v120, v121, v122, v123, v124,
v125, v126, v127, v128, v129, v130, v131, v132, v133, v134, v135, v136, v137, v138, v139,
v140, v141, v142, v143, v144, v145, v146, v147, v148, v149, v150, v151, v152, v153, v154,
v155, v156, v157, v158, v159, v160, v161, v162, v163, v164, v165, v166, v167, v168, v169,
v170, v171, v172, v173, v174, v175, to_date(substring(Time from 1 for 6),'yyMMdd') AS day
FROM zhongxing_2015;
>> but when I query one partition with the sql
>> select
>> sum(v1)+sum(v2)+sum(v3)+sum(v4)+sum(v5) as P1,
>> sum(v6)+sum(v7)+sum(v8)+sum(v9)+sum(v10) as P2,
>> sum(v11)+sum(v12)+sum(v13)+sum(v14)+sum(v15) as P3,
>> sum(v16)+sum(v17)+sum(v18)+sum(v19)+sum(v20) as P4,
>> sum(v21)+sum(v22)+sum(v23)+sum(v24)+sum(v25) as P5,
>> sum(v26)+sum(v27)+sum(v28)+sum(v29)+sum(v30) as P6,
>> sum(v31)+sum(v32)+sum(v33)+sum(v34)+sum(v35) as P7,
>> sum(v36)+sum(v37)+sum(v38)+sum(v39)+sum(v40) as P8,
>> sum(v41)+sum(v42)+sum(v43)+sum(v44)+sum(v45) as P9,
>> sum(v46)+sum(v47)+sum(v48)+sum(v49)+sum(v50) as P10,
>> sum(v51)+sum(v52)+sum(v53)+sum(v54)+sum(v55) as P11,
>> sum(v56)+sum(v57)+sum(v58)+sum(v59)+sum(v60) as P12,
>> sum(v61)+sum(v62)+sum(v63)+sum(v64)+sum(v65) as P13,
>> sum(v66)+sum(v67)+sum(v68)+sum(v69)+sum(v70) as P14,
>> sum(v71)+sum(v72)+sum(v73)+sum(v74)+sum(v75) as P15,
>> sum(v76)+sum(v77)+sum(v78)+sum(v79)+sum(v80) as P16,
>> sum(v81)+sum(v82)+sum(v83)+sum(v84)+sum(v85) as P17,
>> sum(v86)+sum(v87)+sum(v88)+sum(v89)+sum(v90) as P18,
>> sum(v91)+sum(v92)+sum(v93)+sum(v94)+sum(v95) as P19,
>> sum(v96)+sum(v97)+sum(v98)+sum(v99)+sum(v100) as P20,
>> sum(v101)+sum(v102)+sum(v103)+sum(v104)+sum(v105) as P21,
>> sum(v106)+sum(v107)+sum(v108)+sum(v109)+sum(v110) as P22,
>> sum(v111)+sum(v112)+sum(v113)+sum(v114)+sum(v115) as P23,
>> sum(v116)+sum(v117)+sum(v118)+sum(v119)+sum(v120) as P24,
>> sum(v121)+sum(v122)+sum(v123)+sum(v124)+sum(v125) as P25,
>> sum(v126)+sum(v127)+sum(v128)+sum(v129)+sum(v130) as P26,
>> sum(v131)+sum(v132)+sum(v133)+sum(v134)+sum(v135) as P27,
>> sum(v136)+sum(v137)+sum(v138)+sum(v139)+sum(v140) as P28,
>> sum(v141)+sum(v142)+sum(v143)+sum(v144)+sum(v145) as P29,
>> sum(v146)+sum(v147)+sum(v148)+sum(v149)+sum(v150) as P30,
>> sum(v151)+sum(v152)+sum(v153)+sum(v154)+sum(v155) as P31,
>> sum(v156)+sum(v157)+sum(v158)+sum(v159)+sum(v160) as P32,
>> sum(v161)+sum(v162)+sum(v163)+sum(v164)+sum(v165) as P33,
>> sum(v166)+sum(v167)+sum(v168)+sum(v169)+sum(v170) as P34,
>> sum(v171)+sum(v172)+sum(v173)+sum(v174)+sum(v175) as P35,
>> substr(time,1,4) as month,
>> eNodeBID,
>> CellID
>> from zhongxing_2015_obj_12months
>> where (day >= '2015-03-01' and day <= '2015-03-31')
>> and (Time between '1503010000' and '1503312345')
>> and RegionID=999
>> group by eNodeBID,CellID,month
>> order by P1 DESC limit 50;
>> I found the performance is not good, and when I substitute the table name ‘zhongxing_2015_obj_12months’
with the sub partition table name ‘zhongxing_2015_obj_12months_1_prt_mar’, I got almost
1x performance improvement. 
>> I think for the partitioned table, this two sql will be parsed to the same thing.
So I wonder why this situation could be happened. I run the explain query, and the result
as follows:
           QUERY PLAN                                                                    
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=1725101.68..1725102.80 rows=50 width=324)
>>    ->  Gather Motion 192:1  (slice2; segments: 192)  (cost=1725101.68..1725102.80
rows=50 width=324)
>>          Merge Key: p1
>>          ->  Limit  (cost=1725101.68..1725101.80 rows=1 width=324)
>>                ->  Sort  (cost=1725101.68..1725113.19 rows=24 width=324)
>>                      Sort Key (Limit): p1
>>                      ->  HashAggregate  (cost=1721149.86..1724821.55 rows=24 width=324)
>>                            Group By: public.zhongxing_2015_obj_12months.enodebid,
public.zhongxing_2015_obj_12months.cellid, "?column3?"
>>                            ->  Redistribute Motion 192:192  (slice1; segments:
192)  (cost=1718939.94..1719055.04 rows=24 width=1444)
>>                                  Hash Key: public.zhongxing_2015_obj_12months.enodebid,
public.zhongxing_2015_obj_12months.cellid, unnamed_attr_3
>>                                  ->  HashAggregate  (cost=1718939.94..1718962.96
rows=24 width=1444)
>>                                        Group By: public.zhongxing_2015_obj_12months.enodebid,
public.zhongxing_2015_obj_12months.cellid, substr(public.zhongxing_2015_obj_12months.day::text,
1, 7)
>>                                        ->  Result  (cost=0.00..1698454.59 rows=240
>>                                              ->  Append  (cost=0.00..1698224.41
rows=240 width=716)
>>                                                    ->  Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar
zhongxing_2015_obj_12months  (cost=0.00..1698224.41 rows=240 width=716)
>>                                                          Filter: day = '2015-03-01
00:00:00'::timestamp without time zone
>> I can see the “Parquet table Scan on zhongxing_2015_obj12months_1_prt_mar zhongxing_2015_obj_12months”
part, dose this mean scanner scanning ALL DATA  of all partitions in table zhongxing_2015_obj_12months
plus one sub table zhongxing_2015_obj_12months_1_prt_mar include one partition of 12?
>> I will waiting for the answer online, this problem is very critical and urgent for
>> Thank you so mush!!!
>> Best Regards
>> Rui Zha
>  <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient>
受 Avast 保护的无病毒计算机已发送该电子邮件。 
> www.avast.com <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient>

View raw message