Date: Fri, 4 Dec 2020 07:57:00 +0000 (UTC)
From: "ASF GitHub Bot (Jira)"
To: issues@kylin.apache.org
Reply-To: dev@kylin.apache.org
Subject: [jira] [Commented] (KYLIN-4829) Support to use thread-level SparkSession to execute query

[
https://issues.apache.org/jira/browse/KYLIN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243804#comment-17243804 ]

ASF GitHub Bot commented on KYLIN-4829:
---------------------------------------

zzcclp commented on pull request #1495:
URL: https://github.com/apache/kylin/pull/1495#issuecomment-738628576

## The Results of Testing Manually

### Test Env

- Hadoop 2.7.0 on docker.
- Commit : [3b3786c5c](https://github.com/apache/kylin/commit/3b3786c5c9602838cd4abd0a6d40574550ec8622)
- Sparder Env :
  - spark.executor.cores=1
  - spark.executor.instances=4
  - spark.executor.memory=2G
  - spark.executor.memoryOverhead=1G
  - spark.sql.shuffle.partitions=4

### Before this patch

The shuffle partition number of all queries is 4, which equals the total number of cores (4 executor instances with 1 core each).

![image](https://user-images.githubusercontent.com/9430290/101136306-19016880-3648-11eb-8ae0-2e02d42a41ac.png)
![image](https://user-images.githubusercontent.com/9430290/101136373-32a2b000-3648-11eb-83dd-83b52e2d9980.png)
![image](https://user-images.githubusercontent.com/9430290/101136443-4e0dbb00-3648-11eb-8d31-ac721797ee94.png)
![image](https://user-images.githubusercontent.com/9430290/101136476-5a921380-3648-11eb-90f7-ec20faeca57b.png)

### After this patch

The shuffle partition number of each query is calculated from the number of bytes that query scans:

![image](https://user-images.githubusercontent.com/9430290/101136174-e3f51600-3647-11eb-99c9-290831bb30af.png)
![image](https://user-images.githubusercontent.com/9430290/101136210-f40cf580-3647-11eb-8eec-0b7bdef93c30.png)
![image](https://user-images.githubusercontent.com/9430290/101136227-fa02d680-3647-11eb-9001-128c0e3bd490.png)
![image](https://user-images.githubusercontent.com/9430290/101136249-05ee9880-3648-11eb-9267-c1cb44319697.png)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

> Support to use thread-level SparkSession to execute query
> ----------------------------------------------------------
>
>                 Key: KYLIN-4829
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4829
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Query Engine, Spark Engine
>            Reporter: Zhichao Zhang
>            Assignee: Zhichao Zhang
>            Priority: Minor
>             Fix For: v4.0.0-beta
>
>
> Currently, when executing a query, it is impossible to configure parameters such as spark.sql.shuffle.partitions for each query according to the data that will be scanned. This hurts query performance.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
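
The idea behind the issue can be modelled without Spark. The sketch below is not Kylin's implementation: it only illustrates choosing a shuffle partition count from the bytes a query scans and storing it in a per-thread configuration, so that concurrent queries do not overwrite each other's setting. `BYTES_PER_PARTITION`, the cap at `TOTAL_CORES`, and every function name are assumptions made for illustration; only the property name and the default of 4 come from the test environment above.

```python
import math
import threading
from concurrent.futures import ThreadPoolExecutor

# Assumed tuning knob: bytes one shuffle partition should handle (64 MiB).
BYTES_PER_PARTITION = 64 * 1024 * 1024
# Assumed upper bound: 4 executor instances x 1 core, as in the test env.
TOTAL_CORES = 4

# Shared defaults, standing in for the application-wide configuration.
DEFAULT_CONF = {"spark.sql.shuffle.partitions": "4"}

# Per-thread storage, standing in for a thread-level session.
_thread_state = threading.local()


def shuffle_partitions(scanned_bytes: int) -> int:
    """Scale the partition count with scanned bytes, clamped to [1, TOTAL_CORES]."""
    wanted = math.ceil(scanned_bytes / BYTES_PER_PARTITION)
    return max(1, min(wanted, TOTAL_CORES))


def thread_conf() -> dict:
    """Return the calling thread's private copy of the configuration."""
    if not hasattr(_thread_state, "conf"):
        _thread_state.conf = dict(DEFAULT_CONF)
    return _thread_state.conf


def run_query(scanned_bytes: int) -> int:
    """Set the partition count for this thread only and return the value used."""
    conf = thread_conf()
    conf["spark.sql.shuffle.partitions"] = str(shuffle_partitions(scanned_bytes))
    return int(conf["spark.sql.shuffle.partitions"])


if __name__ == "__main__":
    # Three queries of very different sizes, each on its own worker thread.
    sizes = [1 * 1024 * 1024, 100 * 1024 * 1024, 10 * 1024 ** 3]
    with ThreadPoolExecutor(max_workers=3) as pool:
        print(list(pool.map(run_query, sizes)))
```

In Spark itself, `SparkSession.newSession()` returns a session with its own SQL configuration while sharing the underlying `SparkContext`, which is one way to get this kind of isolation. Whether the patch uses it is not stated in this message.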