drill-issues mailing list archives

From "Jinfeng Ni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5468) TPCH Q18 regressed ~3x due to execution plan changes
Date Thu, 13 Jul 2017 22:32:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086530#comment-16086530 ]

Jinfeng Ni commented on DRILL-5468:
-----------------------------------

[~amansinha100], it turns out that Drill did not pick the broadcast join because the estimated
row count of the RHS is higher than the default broadcast threshold (10M).

On the master branch, the logical plan for the RHS looks as follows. The row count is estimated as
600M * 1/10 * 0.5 = 30M, which is above the threshold.

{code}
Filter
  \
   Agg
    \
     Scan (Lineitem)
{code}

Prior to DRILL-4678, the logical plan looked as follows. The row count of the RHS was estimated
as 600M * 1/10 * 0.5 * 1/10 = 3M, which is below the threshold.
{code}
Agg
  \
   Filter
    \
     Agg
      \
       Scan (Lineitem)
{code}

If I increase the broadcast threshold from 10M to 35M, then I get the broadcast join, and
TPCH Q18's 3X regression is gone. 
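
As a rough sanity check of the arithmetic above, here is a small Python sketch of the two estimates and the threshold comparison. It is not Drill's planner code: the function names are made up for illustration, the per-operator factors (1/10 per Agg, 0.5 for the Filter) are simply the ones quoted in this comment, and the strict "less than" comparison against the threshold (presumably the {{planner.broadcast_threshold}} option) is an assumption.

```python
from fractions import Fraction

# Numbers quoted in this comment; exact fractions avoid float rounding.
LINEITEM_ROWS_SF100 = 600_000_000
AGG_FACTOR = Fraction(1, 10)         # reduction applied per Agg
FILTER_SELECTIVITY = Fraction(1, 2)  # reduction applied by the Filter


def estimate_rows(base_rows, factors):
    """Multiply the scan row count by each operator's reduction factor."""
    rows = Fraction(base_rows)
    for factor in factors:
        rows *= factor
    return int(rows)


def picks_broadcast(rhs_rows, threshold):
    """Assumed rule: broadcast only if the RHS estimate is below the threshold."""
    return rhs_rows < threshold


# master: Filter <- Agg <- Scan
master_rhs = estimate_rows(LINEITEM_ROWS_SF100, [AGG_FACTOR, FILTER_SELECTIVITY])
# before DRILL-4678: Agg <- Filter <- Agg <- Scan
prior_rhs = estimate_rows(
    LINEITEM_ROWS_SF100, [AGG_FACTOR, FILTER_SELECTIVITY, AGG_FACTOR]
)

for threshold in (10_000_000, 35_000_000):
    print(
        f"threshold={threshold:,}: "
        f"master ({master_rhs:,} rows) broadcast={picks_broadcast(master_rhs, threshold)}, "
        f"prior ({prior_rhs:,} rows) broadcast={picks_broadcast(prior_rhs, threshold)}"
    )
```

With the default 10M threshold only the pre-DRILL-4678 estimate (3M) qualifies for broadcast; at 35M the master estimate (30M) qualifies as well, which matches what I see.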

Previously, I had the impression that the estimates were 300k vs 3M. That's because I was using TPCH
scale factor 10 in my debugging. The physical plan on SF100 also shows 300k vs 3M, since in both plans
the two-phase aggregation kicks in and reduces the row count by another 1/10.
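
To show where the 300k vs 3M numbers come from in both cases, here is a hedged Python sketch. The helper names are hypothetical, the 60M row count for Lineitem at SF10 is my assumption (scaled down from the 600M used for SF100), and the extra 1/10 for two-phase aggregation is taken from the sentence above rather than from the planner source.

```python
# Approximate Lineitem row counts; the SF10 value is an assumption.
LINEITEM_ROWS = {"SF10": 60_000_000, "SF100": 600_000_000}


def logical_rhs_rows(lineitem_rows, has_top_agg):
    """Logical-plan estimate for the RHS of the join."""
    rows = lineitem_rows // 10  # Agg over Scan: 1/10
    rows = rows // 2            # Filter: 0.5
    if has_top_agg:
        rows = rows // 10       # extra Agg on top (plan before DRILL-4678)
    return rows


def physical_rhs_rows(lineitem_rows, has_top_agg):
    """Physical-plan estimate: two-phase aggregation cuts another 1/10."""
    return logical_rhs_rows(lineitem_rows, has_top_agg) // 10


# Logical plan on SF10: prior vs master
print(logical_rhs_rows(LINEITEM_ROWS["SF10"], True),
      logical_rhs_rows(LINEITEM_ROWS["SF10"], False))
# Physical plan on SF100: prior vs master
print(physical_rhs_rows(LINEITEM_ROWS["SF100"], True),
      physical_rhs_rows(LINEITEM_ROWS["SF100"], False))
```

Both lines print 300000 and 3000000, which is why the SF10 logical plan and the SF100 physical plan gave the same impression.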


> TPCH Q18 regressed ~3x due to execution plan changes
> ----------------------------------------------------
>
>                 Key: DRILL-5468
>                 URL: https://issues.apache.org/jira/browse/DRILL-5468
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.11.0
>         Environment: 10+1 node ucs-micro cluster RHEL6.4
>            Reporter: Dechang Gu
>            Assignee: Jinfeng Ni
>             Fix For: 1.11.0
>
>         Attachments: Q18_profile_gitid_841ead4, Q18_profile_gitid_adbf363
>
>
> In a regular regression test on Drill master (commit id 841ead4), TPCH Q18 on the SF100 parquet dataset took ~81 secs, while the same query on 1.10.0 took only ~27 secs. The query time on commit adbf363, which is right before 841ead4, is ~32 secs.
> The profiles show that the plans for the query changed quite a bit (profiles will be uploaded).




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
