drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3710) Make the 20 in-list optimization configurable
Date Fri, 22 Jul 2016 19:39:20 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15390089#comment-15390089 ]

ASF GitHub Bot commented on DRILL-3710:
---------------------------------------

Github user gparai commented on a diff in the pull request:

    https://github.com/apache/drill/pull/552#discussion_r71934130
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestPartitionFilter.java ---
    @@ -376,4 +376,14 @@ public void testPartitionFilterWithLike() throws Exception {
         testIncludeFilter(query4, 4, "Filter", 16);
       }
     
    +  @Test //DRILL-3710 Partition pruning should occur with varying IN-LIST size
    +  public void testPartitionFilterWithInSubquery() throws Exception {
    +    String query = String.format("select * from dfs_test.`%s/multilevel/parquet` where cast (dir0 as int) IN (1994, 1994, 1994, 1994, 1994, 1994)", TEST_RES_PATH);
    +    /* In list size exceeds threshold - no partition pruning since predicate converted to join */
    +    test("alter session set `planner.in_subquery_threshold` = 2");
    --- End diff --
    
    A bug could cause us to ignore the option entirely, i.e. we would always do partition pruning
    regardless of the option setting. This unit test checks that we do obey the option.
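
    For readers following the thread, the behavior the option controls can be sketched
    roughly as below. This is a minimal illustration, not Drill's planner code: the class,
    enum, and method names are hypothetical, and treating the cutoff as "strictly more
    than the threshold" is an assumption taken from the wording in this thread.

```java
public class InListPlanSketch {

    /** Hypothetical labels for the two plan shapes discussed in this thread. */
    public enum Strategy {
        OR_PREDICATES,        // IN-list stays a filter, so partition pruning can apply
        JOIN_AGAINST_VALUES   // IN-list is converted to a join, so no filter is left to prune on
    }

    /**
     * Picks a plan shape from the IN-list size and the configured threshold.
     * Assumption: the list is converted only when its size exceeds the threshold.
     */
    public static Strategy choose(int inListSize, long threshold) {
        return inListSize > threshold
                ? Strategy.JOIN_AGAINST_VALUES
                : Strategy.OR_PREDICATES;
    }

    public static void main(String[] args) {
        // Six values with the threshold lowered to 2, as in the unit test above.
        System.out.println(choose(6, 2));
        // The same six values with a threshold of 20.
        System.out.println(choose(6, 20));
    }
}
```

    With the threshold lowered to 2, the six-value list in the test falls on the join side,
    which is why the test expects no partition pruning in that case.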


> Make the 20 in-list optimization configurable
> ---------------------------------------------
>
>                 Key: DRILL-3710
>                 URL: https://issues.apache.org/jira/browse/DRILL-3710
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization
>    Affects Versions: 1.1.0
>            Reporter: Hao Zhu
>            Assignee: Gautam Kumar Parai
>             Fix For: Future
>
>
> If a query has an IN-list with more than 20 values, Drill can convert the IN-list
> into a small in-memory hash table and then do a table join instead.
> This can improve the performance of queries with large IN-lists.
> Could we make "20" configurable, so that we do not need to add duplicate/junk values
> to the IN-list to push it past 20?
> Sample query:
> select count(*) from table where col in (1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1);
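
The workaround described above relies on the fact that duplicate IN-list values do not change the result of a hash lookup. A minimal sketch of that idea follows. It is not Drill's join operator: the class and method names are hypothetical, and the padded list of 21 copies and the sample column values are made up for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class InListHashSketch {

    /** Builds the in-memory lookup from the IN-list; duplicate values collapse. */
    public static Set<Integer> buildLookup(List<Integer> inList) {
        return new HashSet<>(inList);
    }

    /** Counts column values that appear in the lookup, like count(*) with an IN filter. */
    public static long countMatches(int[] column, Set<Integer> lookup) {
        long matches = 0;
        for (int value : column) {
            if (lookup.contains(value)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical padded IN-list: the value 1 repeated 21 times.
        List<Integer> padded = Collections.nCopies(21, 1);
        Set<Integer> lookup = buildLookup(padded);

        int[] column = {1, 2, 1, 3, 1}; // made-up column data
        System.out.println("distinct lookup entries: " + lookup.size());
        System.out.println("matching rows: " + countMatches(column, lookup));
    }
}
```

The padding only serves to cross the size cutoff; the lookup itself ends up with a single entry, which is why a configurable threshold would make the junk values unnecessary.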



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
