Date: Mon, 18 Jul 2016 19:02:20 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Commented] (DRILL-4743) HashJoin's not fully parallelized in query plan

    [ https://issues.apache.org/jira/browse/DRILL-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382838#comment-15382838 ]

ASF GitHub Bot commented on DRILL-4743:
---------------------------------------

Github user sudheeshkatkam commented on a diff in the pull request:
    https://github.com/apache/drill/pull/534#discussion_r71207858

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java ---
    @@ -212,6 +219,14 @@ public static long getInitialPlanningMemorySize() {
         return INITIAL_OFF_HEAP_ALLOCATION_IN_BYTES;
       }
    +  public double getFilterMinSelectivityEstimateFactor() {
    +    return options.getOption(FILTER_MIN_SELECTIVITY_ESTIMATE_FACTOR.getOptionName()).float_val;
    --- End diff --

    Make the option validators above typed:
    `public static final FloatValidator FILTER_MIN_SELECTIVITY_ESTIMATE_FACTOR ...`
    and change this line to:
    `return options.getOption(FILTER_MIN_SELECTIVITY_ESTIMATE_FACTOR);`
    Same for the other option.


> HashJoin's not fully parallelized in query plan
> -----------------------------------------------
>
>                 Key: DRILL-4743
>                 URL: https://issues.apache.org/jira/browse/DRILL-4743
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.5.0
>            Reporter: Gautam Kumar Parai
>            Assignee: Gautam Kumar Parai
>              Labels: doc-impacting
>
> The underlying problem is filter selectivity under-estimate for a query with complicated predicates e.g. deeply nested and/or predicates. This leads to under parallelization of the major fragment doing the join.
> To really resolve this problem we need table/column statistics to correctly estimate the selectivity. However, in the absence of statistics OR even when existing statistics are insufficient to get a correct estimate of selectivity this will serve as a workaround.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
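
For readers unfamiliar with the pattern the review comment asks for, here is a minimal, self-contained sketch of the difference between an untyped lookup by option name (where the caller picks a field such as `float_val`) and a lookup through a typed validator. The classes `TypedValidator` and `Options`, the option name string, and the default value `0.0` are hypothetical stand-ins written for illustration only. They are not Drill's actual classes or settings, and only the constant name and the getter name are taken from the diff above.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedOptionSketch {

  /** Hypothetical typed validator: knows the option name, value type, and a default. */
  static final class TypedValidator<T> {
    private final String optionName;
    private final Class<T> type;
    private final T defaultValue;

    TypedValidator(String optionName, Class<T> type, T defaultValue) {
      this.optionName = optionName;
      this.type = type;
      this.defaultValue = defaultValue;
    }

    String getOptionName() {
      return optionName;
    }
  }

  /** Hypothetical option store offering both lookup styles. */
  static final class Options {
    private final Map<String, Object> values = new HashMap<>();

    void set(String name, Object value) {
      values.put(name, value);
    }

    // Untyped lookup: the caller must know the value type and cast it.
    Object getOption(String name) {
      return values.get(name);
    }

    // Typed lookup: the validator determines the return type.
    // Falls back to the validator's default when the option is unset.
    <T> T getOption(TypedValidator<T> validator) {
      Object raw = values.get(validator.optionName);
      return raw == null ? validator.defaultValue : validator.type.cast(raw);
    }
  }

  // Assumed name and default, chosen only for this sketch.
  static final TypedValidator<Double> FILTER_MIN_SELECTIVITY_ESTIMATE_FACTOR =
      new TypedValidator<>("hypothetical.filter.min_selectivity_estimate_factor",
          Double.class, 0.0);

  // With a typed validator the getter needs no field access or cast.
  static double getFilterMinSelectivityEstimateFactor(Options options) {
    return options.getOption(FILTER_MIN_SELECTIVITY_ESTIMATE_FACTOR);
  }

  public static void main(String[] args) {
    Options options = new Options();
    System.out.println(getFilterMinSelectivityEstimateFactor(options));
    options.set(FILTER_MIN_SELECTIVITY_ESTIMATE_FACTOR.getOptionName(), 0.5);
    System.out.println(getFilterMinSelectivityEstimateFactor(options));
  }
}
```

The point of the typed form is that a mismatch between the option and the expected value type surfaces at the validator declaration (or as a cast failure in one place) instead of at each call site that reads a type-specific field.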