Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 76909200D3E for ; Thu, 2 Nov 2017 02:31:18 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 7530C160BFA; Thu, 2 Nov 2017 01:31:18 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 9BBFA160BEA for ; Thu, 2 Nov 2017 02:31:17 +0100 (CET) Received: (qmail 61167 invoked by uid 500); 2 Nov 2017 01:31:16 -0000 Mailing-List: contact dev-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@impala.incubator.apache.org Delivered-To: mailing list dev@impala.incubator.apache.org Received: (qmail 61155 invoked by uid 99); 2 Nov 2017 01:31:16 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 02 Nov 2017 01:31:16 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id B1A131A0F55 for ; Thu, 2 Nov 2017 01:31:15 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.39 X-Spam-Level: X-Spam-Status: No, score=0.39 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-2.8, RCVD_IN_SORBS_SPAM=0.5, SPF_PASS=-0.001, UPPERCASE_50_75=0.791] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=cloudera.com Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id mEw7Zi4Leqz1 for ; Thu, 2 Nov 2017 01:31:13 +0000 (UTC) Received: from mail-io0-f181.google.com (mail-io0-f181.google.com [209.85.223.181]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id A0DF35FE0E for ; Thu, 2 Nov 2017 01:31:13 +0000 (UTC) Received: by mail-io0-f181.google.com with SMTP id m16so10324699iod.1 for ; Wed, 01 Nov 2017 18:31:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudera.com; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=9CiWeCMtU/pNZIiljZnDyNOAa/GJhSLRapLGOPqdbqo=; b=VuptliTuu/hDPLSUOpsafHX9fVb2Phd4Szq3cGFiMzn5n6xxzApj1Q9gNlx28/Fy10 P03VSh9AoulSzjw5JvAnrPbj53vsOp8MrUFD1FZ5U3TO2HuurT1aZBjoBhoyTw3ubr96 FO/3EdR+F2OHzkKlGwVzeLJPlu8JLC3imwU+1ShcFEMb2gc0MQloChKoVVzZ1BAcVaJK sig4KSK89iBHsHAge92qASasPaWel4xBpXGFhgajrvJFVesG+Gkkfou3CWmMd4IpBROl Cnp2pO45W9+Bsc9vGvBT+EUcMHFE9vY3/xLwgmWCjIhffgSEgkp15pYTOR/QPleyj0KC G+mw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=9CiWeCMtU/pNZIiljZnDyNOAa/GJhSLRapLGOPqdbqo=; b=tUf/uOhRXvjo/l2CnNLZeSX0LaBfSjT5RlTScoMqZoeOpocEu2OeAjivEpZY9IhiCS Ux8rU/3RQCEdtQ05VDG+VBJCzDUFO6UPX07K8ftWK4YZzm17JLW1jW7U23vxd2xPHv/+ fOoeUzsgKvAKb9u/JoHwJYKcvhNEnqP2Rl46QHXCgWNdy+yzIbYeCBBOkHpl00QNJbYl A00MR7+bmfyvC9mb9LMp+hLzkLJ1rmIESBzEnU1yb7dB/ELfgXxz3X6GSifNOB30B6iJ N2mM8cBZCjHDS/D+ZereAoQ3+fj+XAGskUl8lzum1w33sCv5V6zF8RHdKzQchwuu5bh8 fSSw== X-Gm-Message-State: AMCzsaUejVLPN/qEKPGxGoCB+uhRLcKaZDvMWgj7tHBNEG0ipsW8fT6y ABt6qez2AzFd6QMmTYcOmGjWYFOe3VUWx9mWbe1yig== X-Google-Smtp-Source: ABhQp+T2bz+2Wxk8bxuDohSTcOXPTHUAxGO2p+oS6Aso+Lh0wp7VitpVJmEFdge4DUtOZBMtMN0i13sNG6lEF/UIjhA= X-Received: by 10.36.53.15 with SMTP id k15mr365298ita.75.1509586272765; Wed, 01 Nov 2017 18:31:12 -0700 (PDT) MIME-Version: 1.0 Received: by 10.79.6.208 with HTTP; Wed, 1 Nov 2017 18:30:32 -0700 (PDT) In-Reply-To: References: From: Mostafa Mokhtar Date: Wed, 1 Nov 2017 18:30:32 -0700 Message-ID: Subject: Re: performance issue on big table join To: dev@impala.incubator.apache.org Content-Type: multipart/alternative; boundary="001a114a91788bfb8a055cf5f12e" archived-at: Thu, 02 Nov 2017 01:31:18 -0000 --001a114a91788bfb8a055cf5f12e Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Attaching the query profile will be most helpful to investigate this issue. If you can capture the profile from the WebUI on the coordinator node it would be great. On Wed, Nov 1, 2017 at 6:22 PM, =E4=BF=8A=E6=9D=B0=E9=99=88 wrote: > Thanks Hongxu, > > Here are configurations on my cluster, most of them are default values. > Which item do you think it may impact? > > ABORT_ON_DEFAULT_LIMIT_EXCEEDED: [0] > ABORT_ON_ERROR: [0] > ALLOW_UNSUPPORTED_FORMATS: [0] > APPX_COUNT_DISTINCT: [0] > BATCH_SIZE: [0] > COMPRESSION_CODEC: [NONE] > DEBUG_ACTION: [] > DEFAULT_ORDER_BY_LIMIT: [-1] > DISABLE_CACHED_READS: [0] > DISABLE_CODEGEN: [0] > DISABLE_OUTERMOST_TOPN: [0] > DISABLE_ROW_RUNTIME_FILTERING: [0] > DISABLE_STREAMING_PREAGGREGATIONS: [0] > DISABLE_UNSAFE_SPILLS: [0] > ENABLE_EXPR_REWRITES: [1] > EXEC_SINGLE_NODE_ROWS_THRESHOLD: [100] > EXPLAIN_LEVEL: [1] > HBASE_CACHE_BLOCKS: [0] > HBASE_CACHING: [0] > MAX_BLOCK_MGR_MEMORY: [0] > MAX_ERRORS: [100] > MAX_IO_BUFFERS: [0] > MAX_NUM_RUNTIME_FILTERS: [10] > MAX_SCAN_RANGE_LENGTH: [0] > MEM_LIMIT: [0] > MT_DOP: [0] > NUM_NODES: [0] > NUM_SCANNER_THREADS: [0] > OPTIMIZE_PARTITION_KEY_SCANS: [0] > PARQUET_ANNOTATE_STRINGS_UTF8: [0] > PARQUET_FALLBACK_SCHEMA_RESOLUTION: [0] > PARQUET_FILE_SIZE: [0] > PREFETCH_MODE: [1] > QUERY_TIMEOUT_S: [0] > REPLICA_PREFERENCE: [0] > REQUEST_POOL: [] > RESERVATION_REQUEST_TIMEOUT: [0] > RM_INITIAL_MEM: [0] > RUNTIME_BLOOM_FILTER_SIZE: [1048576] > RUNTIME_FILTER_MAX_SIZE: [16777216] > RUNTIME_FILTER_MIN_SIZE: [1048576] > RUNTIME_FILTER_MODE: [2] > RUNTIME_FILTER_WAIT_TIME_MS: [0] > S3_SKIP_INSERT_STAGING: [1] > SCAN_NODE_CODEGEN_THRESHOLD: [1800000] > SCHEDULE_RANDOM_REPLICA: [0] > SCRATCH_LIMIT: [-1] > SEQ_COMPRESSION_MODE: [0] > STRICT_MODE: [0] > SUPPORT_START_OVER: [false] > SYNC_DDL: [0] > V_CPU_CORES: [0] > > 2017-10-31 15:30 GMT+08:00 Hongxu Ma : > > > Hi JJ > > Consider it only takes 3mins on SparkSQL, maybe there are some mistakes > in > > query options. > > Try run "set;" in impala-shell and check all query options, e.g: > > BATCH_SIZE: [0] > > DISABLE_CODEGEN: [0] > > RUNTIME_FILTER_MODE: GLOBAL > > > > Just a guess, thanks. > > > > =E5=9C=A8 27/10/2017 10:25, =E4=BF=8A=E6=9D=B0=E9=99=88 =E5=86=99=E9=81= =93: > > The profile file is damaged. Here is a screenshot for exec summary > > [cid:ii_j999ymep1_15f5ba563aeabb91] > > =E2=80=8B > > > > 2017-10-27 10:04 GMT+08:00 =E4=BF=8A=E6=9D=B0=E9=99=88 > njust@gmail.com>>: > > Hi Devs > > > > I met a performance issue on big table join. The query takes more than = 3 > > hours on Impala and only 3 minutes on Spark SQL on the same 5 nodes > > cluster. when running query, the left scanner and exchange node are ve= ry > > slow. Did I miss some key arguments? > > > > you can see profile file in attachment. > > > > [cid:ii_j9998pph2_15f5b92f2cf47020] > > =E2=80=8B > > -- > > Thanks & Best Regards > > > > > > > > -- > > Thanks & Best Regards > > > > > > -- > > Regards, > > Hongxu. > > > > > > -- > Thanks & Best Regards > --001a114a91788bfb8a055cf5f12e--