Subject: Re: Could anybody help to explain why there is such a big gap in 'ProbeTime' and how to fix this gap?
From: "helifu@gmail.com"
Reply-To: user@impala.apache.org
Date: Tue, 09 Jan 2018 06:17:15 -0000

The CPU and memory are sufficient.
Only one core is at 100% during the PHJ:

----->top + H
   PID USER    PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
146455 impala  20   0 34.534g 2.130g  80992 R  99.9  0.8   0:05.94 impalad
 51161 kudu    20   0 50.041g 0.021t  36788 S   4.9  8.6 672:05.66 rpc reactor-511
 51177 kudu    20   0 50.041g 0.021t  36788 S   4.9  8.6   0:37.49 rpc worker-5117
 51163 kudu    20   0 50.041g 0.021t  36788 S   4.6  8.6 417:25.94 rpc reactor-511
 51185 kudu    20   0 50.041g 0.021t  36788 S   3.9  8.6   0:03.43 rpc worker-5118
 51337 kudu    20   0 50.041g 0.021t  36788 S   3.9  8.6 417:15.66 maintenance_sch
114193 impala  20   0 34.534g 2.130g  80992 S   3.9  0.8   0:03.96 rpc reactor-114
 51191 kudu    20   0 50.041g 0.021t  36788 S   3.6  8.6   0:20.62 rpc worker-5119
 51190 kudu    20   0 50.041g 0.021t  36788 S   2.9  8.6   0:05.81 rpc worker-5119
 51162 kudu    20   0 50.041g 0.021t  36788 S   2.6  8.6 423:10.56 rpc reac

And in order to make the test more efficient, I changed the SQL to:

select count(*) from supplier as s, partsupp as p where s.s_suppkey = p.ps_suppkey;

Then I used 'sar' and 'perf' to capture some counters, and found that Kudu incurs far more page faults than Parquet (632,653 vs. 190,272 in the perf totals below), though I don't know whether that is meaningful.

kudu(sar) =================================================>>>
root@hadoop958:~# sar -B 1 1000
Linux 3.16.0-4-amd64 (hadoop958.hz.163.org)  01/08/2018  _x86_64_  (40 CPU)

05:21:21 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
05:21:22 PM      0.00      0.00     25.00      0.00     50.00      0.00      0.00      0.00      0.00
05:21:23 PM      0.00      0.00     24.00      0.00     49.00      0.00      0.00      0.00      0.00
05:21:24 PM      0.00      0.00   2385.00      0.00    516.00      0.00      0.00      0.00      0.00
05:21:25 PM      0.00      0.00  29041.00      0.00  20017.00      0.00      0.00      0.00      0.00
05:21:26 PM      0.00      0.00  40957.00      0.00  14432.00      0.00      0.00      0.00      0.00
05:21:27 PM      0.00     32.00  45776.00      0.00   1433.00      0.00      0.00      0.00      0.00
05:21:28 PM      0.00      0.00  12949.00      0.00  20950.00      0.00      0.00      0.00      0.00
05:21:29 PM      0.00      0.00  11335.00      0.00  22574.00      0.00      0.00      0.00      0.00
05:21:30 PM      0.00    100.00  15852.00      0.00  27227.00      0.00      0.00      0.00      0.00
05:21:31 PM      0.00      0.00  17678.00      0.00  29853.00      0.00      0.00      0.00      0.00
05:21:32 PM      0.00      0.00  12681.00      0.00  20305.00      0.00      0.00      0.00      0.00
05:21:33 PM      0.00      0.00  11062.00      0.00  20368.00      0.00      0.00      0.00      0.00
05:21:34 PM      0.00      0.00  11891.00      0.00  20317.00      0.00      0.00      0.00      0.00
05:21:35 PM      0.00     12.00  12495.00      0.00  20329.00      0.00      0.00      0.00      0.00
05:21:36 PM      0.00     68.00  10453.00      0.00  20320.00      0.00      0.00      0.00      0.00
05:21:37 PM      0.00      0.00  10313.00      0.00  20309.00      0.00      0.00      0.00      0.00
05:21:38 PM      0.00      0.00  10738.00      0.00  20350.00      0.00      0.00      0.00      0.00
05:21:39 PM      0.00      0.00  12216.00      0.00  20398.00      0.00      0.00      0.00      0.00
05:21:40 PM      0.00      0.00  12834.00      0.00  21116.00      0.00      0.00      0.00      0.00
05:21:41 PM      0.00      4.00  12497.00      0.00  20319.00      0.00      0.00      0.00      0.00
05:21:42 PM      0.00     16.00  11831.00      0.00  20376.00      0.00      0.00      0.00      0.00
05:21:43 PM      0.00      0.00  14087.00      0.00  24922.00      0.00      0.00      0.00      0.00
05:21:44 PM      0.00      0.00  17885.00      0.00  24956.00      0.00      0.00      0.00      0.00
05:21:45 PM      0.00     28.00  19622.00      0.00  24578.00      0.00      0.00      0.00      0.00
05:21:46 PM      0.00      4.00  11262.00      0.00  20351.00      0.00      0.00      0.00      0.00
05:21:47 PM      0.00      0.00  13248.00      0.00  21835.00      0.00      0.00      0.00      0.00
05:21:48 PM      0.00     24.00  12915.00      0.00  20322.00      0.00      0.00      0.00      0.00
05:21:49 PM      0.00      0.00  13170.00      0.00  21733.00      0.00      0.00      0.00      0.00
05:21:50 PM      0.00     20.00  13614.00      0.00  20660.00      0.00      0.00      0.00      0.00
05:21:51 PM      0.00      8.00  20138.00      0.00  20972.00      0.00      0.00      0.00      0.00
05:21:52 PM      0.00      0.00  13014.00      0.00  20372.00      0.00      0.00      0.00      0.00
05:21:53 PM      0.00      0.00  11716.00      0.00  20336.00      0.00      0.00      0.00      0.00
05:21:54 PM      0.00     12.00  10320.00      0.00  20442.00      0.00      0.00      0.00      0.00
05:21:54 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
05:21:55 PM      0.00      0.00  19144.00      0.00  29316.00      0.00      0.00      0.00      0.00
05:21:56 PM      0.00     96.00  10316.00      0.00  20297.00      0.00      0.00      0.00      0.00
05:21:57 PM      0.00      0.00  10310.00      0.00  20395.00      0.00      0.00      0.00      0.00
05:21:58 PM      0.00      0.00  10583.00      0.00  20864.00      0.00      0.00      0.00      0.00
05:21:59 PM      0.00      0.00  11559.00      0.00  21296.00      0.00      0.00      0.00      0.00
05:22:00 PM      0.00      0.00  10327.00      0.00  20313.00      0.00      0.00      0.00      0.00
05:22:01 PM      0.00    132.00  10406.00      0.00  29654.00      0.00      0.00      0.00      0.00
05:22:02 PM      0.00     12.00  11089.00      0.00  21865.00      0.00      0.00      0.00      0.00
05:22:03 PM      0.00      0.00  11085.00      0.00  21675.00      0.00      0.00      0.00      0.00
05:22:04 PM      0.00      0.00  11332.00      0.00  22583.00      0.00      0.00      0.00      0.00
05:22:05 PM      0.00      0.00  10820.00      0.00  21572.00      0.00      0.00      0.00      0.00
05:22:06 PM      0.00     40.00  15241.00      0.00  28091.00      0.00      0.00      0.00      0.00
05:22:07 PM      0.00      4.00  12533.00      0.00  24664.00      0.00      0.00      0.00      0.00
05:22:08 PM      0.00      0.00  12149.00      0.00  23097.00      0.00      0.00      0.00      0.00
05:22:09 PM      0.00      0.00  11511.00      0.00  21385.00      0.00      0.00      0.00      0.00
05:22:10 PM      0.00      0.00  10339.00      0.00  20445.00      0.00      0.00      0.00      0.00
05:22:11 PM      0.00     16.00  10306.00      0.00  20319.00      0.00      0.00      0.00      0.00
05:22:12 PM      0.00     16.00  10306.00      0.00  20382.00      0.00      0.00      0.00      0.00
05:22:13 PM      0.00      0.00  10312.00      0.00  20299.00      0.00      0.00      0.00      0.00
05:22:14 PM      0.00      0.00  10700.00      0.00  20363.00      0.00      0.00      0.00      0.00
05:22:15 PM      0.00      0.00   8725.00      0.00  18867.00      0.00      0.00      0.00      0.00
05:22:16 PM      0.00     20.00     55.00      0.00  96883.00      0.00      0.00      0.00      0.00
05:22:17 PM      0.00      0.00     24.00      0.00     53.00      0.00      0.00      0.00      0.00
05:22:18 PM      0.00      0.00     24.00      0.00    107.00      0.00      0.00      0.00      0.00
05:22:19 PM      0.00      0.00   1287.00      0.00   1136.00      0.00      0.00      0.00      0.00
05:22:20 PM      0.00      0.00     31.00      0.00     66.00      0.00      0.00      0.00      0.00
05:22:21 PM      0.00     64.00     24.00      0.00     58.00      0.00      0.00      0.00      0.00
05:22:22 PM      0.00      0.00     24.00      0.00     88.00      0.00      0.00      0.00      0.00
05:22:23 PM      0.00      0.00     24.00      0.00     89.00      0.00      0.00      0.00      0.00
05:22:24 PM      0.00      0.00     24.00      0.00     49.00      0.00      0.00      0.00      0.00
05:22:25 PM      0.00      0.00     24.00      0.00     49.00      0.00      0.00      0.00      0.00
05:22:26 PM      0.00      0.00     24.00      0.00     93.00      0.00      0.00      0.00      0.00
05:22:27 PM      0.00      4.00     24.00      0.00     97.00      0.00      0.00      0.00      0.00

parquet(sar) =============================================>>>
root@hadoop958:~# sar -B 1 1000
Linux 3.16.0-4-amd64 (hadoop958.hz.163.org)  01/08/2018  _x86_64_  (40 CPU)

04:37:37 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
04:37:38 PM      0.00     16.00     25.00      0.00     57.00      0.00      0.00      0.00      0.00
04:37:39 PM      0.00      0.00  14975.00      0.00   2585.00      0.00      0.00      0.00      0.00
04:37:40 PM      0.00      0.00  29143.00      0.00   6272.00      0.00      0.00      0.00      0.00
04:37:41 PM      0.00      0.00  67780.00      0.00   7579.00      0.00      0.00      0.00      0.00
04:37:42 PM      0.00      8.00   3734.00      0.00   4083.00      0.00      0.00      0.00      0.00
04:37:43 PM      0.00     60.00   7019.00      0.00   8188.00      0.00      0.00      0.00      0.00
04:37:44 PM      0.00     24.00  11843.00      0.00   8926.00      0.00      0.00      0.00      0.00
04:37:45 PM      0.00      0.00   9809.00      0.00   8289.00      0.00      0.00      0.00      0.00
04:37:46 PM      0.00      0.00   3792.00      0.00   4031.00      0.00      0.00      0.00      0.00
04:37:47 PM      0.00      0.00   3809.00      0.00   4057.00      0.00      0.00      0.00      0.00
04:37:48 PM      0.00     24.00   3796.00      0.00   4065.00      0.00      0.00      0.00      0.00
04:37:49 PM      0.00      0.00   6021.00      0.00   5349.00      0.00      0.00      0.00      0.00
04:37:50 PM      0.00      0.00   3762.00      0.00   4017.00      0.00      0.00      0.00      0.00
04:37:51 PM      0.00      0.00  13333.00      0.00   4450.00      0.00      0.00      0.00      0.00
04:37:52 PM      0.00      8.00   3839.00      0.00   4095.00      0.00      0.00      0.00      0.00
04:37:53 PM      0.00      0.00   3873.00      0.00   4134.00      0.00      0.00      0.00      0.00
04:37:54 PM      0.00     28.00   3763.00      0.00   3992.00      0.00      0.00      0.00      0.00
04:37:55 PM      0.00      0.00  12371.00      0.00  12656.00      0.00      0.00      0.00      0.00
04:37:56 PM      0.00      0.00   3731.00      0.00  60638.00      0.00      0.00      0.00      0.00
04:37:57 PM      0.00    104.00   3866.00      0.00   3835.00      0.00      0.00      0.00      0.00
04:37:58 PM      0.00      0.00   4116.00      0.00   4394.00      0.00      0.00      0.00      0.00
04:37:59 PM      0.00      0.00   7188.00      0.00   6622.00      0.00      0.00      0.00      0.00
04:38:00 PM      0.00      0.00   4940.00      0.00  10193.00      0.00      0.00      0.00      0.00
04:38:01 PM      0.00     60.00    133.00      0.00   9299.00      0.00      0.00      0.00      0.00
04:38:02 PM      0.00     44.00     49.00      0.00    299.00      0.00      0.00      0.00      0.00
04:38:03 PM      0.00      0.00    160.00      0.00     83.00      0.00      0.00      0.00      0.00
04:38:04 PM      0.00      0.00     24.00      0.00     64.00      0.00      0.00      0.00      0.00
04:38:05 PM      0.00      0.00     24.00      0.00     65.00      0.00      0.00      0.00      0.00
04:38:06 PM      0.00      0.00    746.00      0.00    470.00      0.00      0.00      0.00      0.00
04:38:07 PM      0.00      0.00     81.00      0.00  93162.00      0.00      0.00      0.00      0.00
04:38:08 PM      0.00     16.00     34.00      0.00    981.00      0.00      0.00      0.00      0.00
04:38:09 PM      0.00      0.00    680.00      0.00    545.00      0.00      0.00      0.00      0.00
04:38:10 PM      0.00      0.00     66.00      0.00    118.00      0.00      0.00      0.00      0.00
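In case it helps, below is a rough Python sketch of how I compare the fault/s column of the two sar captures above. The file names (sar_kudu.txt, sar_parquet.txt) are just placeholders for the redirected sar output, and the 1000 fault/s threshold is an arbitrary cut-off to isolate the query window:

# Average the fault/s column of `sar -B` output over the busy window.
def avg_fault_rate(path, busy_threshold=1000.0):
    rates = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            # Data rows look like: "05:21:25 PM 0.00 0.00 29041.00 ..."
            if len(parts) >= 5 and parts[1] in ("AM", "PM"):
                try:
                    fault_rate = float(parts[4])  # fault/s is the 3rd metric
                except ValueError:
                    continue  # skip the repeated header rows
                if fault_rate >= busy_threshold:  # keep only the query window
                    rates.append(fault_rate)
    return sum(rates) / len(rates) if rates else 0.0

kudu = avg_fault_rate("sar_kudu.txt")        # hypothetical capture file
parquet = avg_fault_rate("sar_parquet.txt")  # hypothetical capture file
print("kudu: %.0f fault/s, parquet: %.0f fault/s, ratio: %.1fx"
      % (kudu, parquet, kudu / parquet))

Note that the per-second fault rates are in the same ballpark; the bigger difference is that the Kudu run stays in the busy window much longer.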
kudu(perf) =================================================>>>
root@hadoop958:~# perf stat -p 29160 -d
^C
 Performance counter stats for process id '29160':

      75367.439887 task-clock (msec)         #    1.365 CPUs utilized
           323,646 context-switches          #    0.004 M/sec
             8,683 cpu-migrations            #    0.115 K/sec
           632,653 page-faults               #    0.008 M/sec
   162,296,209,992 cycles                    #    2.153 GHz                    [93.18%]
                   stalled-cycles-frontend
                   stalled-cycles-backend
    86,693,164,726 instructions              #    0.53  insns per cycle        [95.31%]
    15,026,610,761 branches                  #  199.378 M/sec                  [95.44%]
        29,680,120 branch-misses             #    0.20% of all branches        [95.54%]
    33,528,975,631 L1-dcache-loads           #  444.873 M/sec                  [95.91%]
     2,484,449,334 L1-dcache-load-misses     #    7.41% of all L1-dcache hits  [11.40%]
     1,465,862,774 LLC-loads                 #   19.450 M/sec                  [10.11%]
                   LLC-load-misses:HG

      55.217146805 seconds time elapsed

parquet(perf) =============================================>>>
root@hadoop958:~# perf stat -p 29160 -d
^C
 Performance counter stats for process id '29160':

      51130.516140 task-clock (msec)         #    1.557 CPUs utilized
           316,994 context-switches          #    0.006 M/sec
             8,232 cpu-migrations            #    0.161 K/sec
           190,272 page-faults               #    0.004 M/sec
   102,279,158,848 cycles                    #    2.000 GHz                    [92.86%]
                   stalled-cycles-frontend
                   stalled-cycles-backend
    94,061,719,561 instructions              #    0.92  insns per cycle        [95.23%]
    16,498,918,033 branches                  #  322.682 M/sec                  [95.15%]
        30,073,148 branch-misses             #    0.18% of all branches        [95.15%]
    29,651,847,951 L1-dcache-loads           #  579.925 M/sec                  [95.43%]
     2,101,137,839 L1-dcache-load-misses     #    7.09% of all L1-dcache hits  [12.35%]
     1,194,026,289 LLC-loads                 #   23.353 M/sec                  [10.25%]
                   LLC-load-misses:HG

      32.849196160 seconds time elapsed
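To put the two perf runs side by side, a quick back-of-the-envelope calculation (plain Python, numbers copied from the output above):

# Counters copied from the two `perf stat` runs above.
kudu = {"page_faults": 632_653, "instructions": 86_693_164_726,
        "cycles": 162_296_209_992, "elapsed_s": 55.217}
parquet = {"page_faults": 190_272, "instructions": 94_061_719_561,
           "cycles": 102_279_158_848, "elapsed_s": 32.849}

print("page-fault ratio:", round(kudu["page_faults"] / parquet["page_faults"], 2))  # ~3.33
print("kudu IPC:   ", round(kudu["instructions"] / kudu["cycles"], 2))              # ~0.53
print("parquet IPC:", round(parquet["instructions"] / parquet["cycles"], 2))        # ~0.92
print("elapsed ratio:", round(kudu["elapsed_s"] / parquet["elapsed_s"], 2))         # ~1.68

So the Kudu run takes more page faults and retires far fewer instructions per cycle, which fits a memory-stall explanation.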
On 2018-01-04 00:20, Tim Armstrong wrote:
> That's a tricky one. I have a couple of ideas, but it's a bit difficult to
> confirm since the profile isn't really designed to easily answer questions
> like this. ProbeTime measures wall-clock time rather than actual time spent
> executing on the CPU.
>
> My first guess is that it's because the Kudu scan is using more CPU than
> the Parquet scan, so the thread doing the scan is competing for resources
> more with the join thread (either hyperthreads competing for resources on
> the same CPU, or threads competing for time on logical processors). You
> could compare the User CPU time for the fragment instances containing the
> joins and scans to see if there is also a discrepancy in CPU time. That
> won't answer this directly but might provide some clues.
>
> My second guess is that there's some kind of subtle memory locality or
> scheduling effect.
>
> On Tue, Jan 2, 2018 at 11:40 PM, helifu wrote:
>
> > Hi everybody,
> >
> > Recently I ran a simple PHJ on 'parquet' and 'kudu' with this SQL
> > independently:
> >
> > select count(*) from lineitem as l, orders as o where l.l_orderkey =
> > o.o_orderkey and o.o_orderdate < '1996-01-01' and o.o_orderdate >=
> > '1995-01-01';
> >
> > And I found that the 'ProbeTime' on 'kudu' is much larger than on
> > 'parquet'!!
> >
> > Below are the plans and profiles:
> >
> > Thanks in advance.
> >
> > 何李夫
> > 2017-04-10 16:06:24
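P.S. To double-check my understanding of Tim's point that ProbeTime is wall-clock time rather than CPU time, here is a minimal Python illustration (not Impala code, just the general idea):

# Wall-clock time keeps ticking while a thread waits or is descheduled;
# CPU time only advances while the thread actually executes.
import time

wall_start = time.monotonic()
cpu_start = time.process_time()

sum(i * i for i in range(10**7))  # burns CPU: both clocks advance
time.sleep(1.0)                   # blocked: only the wall clock advances

print("wall-clock: %.2fs" % (time.monotonic() - wall_start))
print("cpu time:   %.2fs" % (time.process_time() - cpu_start))
# A wall-clock counter like ProbeTime also includes time the probe thread
# spends descheduled (e.g. competing with a busy scan thread); a CPU-time
# counter would not.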