drill-issues mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5468) TPCH Q18 regressed ~3x due to execution plan changes
Date Thu, 13 Jul 2017 22:50:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086552#comment-16086552 ]

Aman Sinha commented on DRILL-5468:

[~jni] Option 1 would be my preference for 1.11; it can be done only for Q18.  Increasing
the default broadcast threshold in the code requires a lot more testing, plus it will cause
baseline plan changes.  I agree that proper rowcount estimation is needed, and we should revisit
that as part of DRILL-1328.
We also have to set expectations that the default value of 10M for broadcast is just an arbitrary
default: it needs to be calibrated based on cluster size, the average row width of the data being
sent over the network, etc.
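
As a rough illustration of why the threshold depends on cluster size and row width, here is
a minimal Python sketch. It assumes the threshold is expressed as a row count and that a
broadcast copies the small side of the join to every node. The function names and the byte
budget are hypothetical, and this is not Drill's actual cost model.

```python
def broadcast_bytes(rows, avg_row_width_bytes, num_nodes):
    """Approximate bytes sent over the network when `rows` rows are
    copied to each of `num_nodes` nodes (simplifying assumption)."""
    return rows * avg_row_width_bytes * num_nodes


def rows_within_budget(network_budget_bytes, avg_row_width_bytes, num_nodes):
    """Largest row count whose broadcast stays within a hypothetical
    network byte budget for the given row width and cluster size."""
    return network_budget_bytes // (avg_row_width_bytes * num_nodes)


if __name__ == "__main__":
    # Illustrative numbers only: a 10 GB budget on a 10-node cluster.
    budget = 10_000_000_000
    for width in (100, 400):
        limit = rows_within_budget(budget, width, num_nodes=10)
        print(f"avg row width {width} B -> broadcast up to {limit} rows")
```

Under these assumptions, the same byte budget allows four times fewer rows when the average
row is four times wider, which is why a single fixed row-count default cannot suit every
cluster and dataset.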

> TPCH Q18 regressed ~3x due to execution plan changes
> ----------------------------------------------------
>                 Key: DRILL-5468
>                 URL: https://issues.apache.org/jira/browse/DRILL-5468
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.11.0
>         Environment: 10+1 node ucs-micro cluster RHEL6.4
>            Reporter: Dechang Gu
>            Assignee: Jinfeng Ni
>             Fix For: 1.11.0
>         Attachments: Q18_profile_gitid_841ead4, Q18_profile_gitid_adbf363
> In a regular regression test on Drill master (commit id 841ead4), TPCH Q18 on the SF100 parquet
> dataset took ~81 secs, while the same query on 1.10.0 took only ~27 secs.  The query time
> on commit adbf363, which is right before 841ead4, is ~32 secs.
> The profiles show that the plans for the query changed quite a bit (profiles will be uploaded).

This message was sent by Atlassian JIRA
