impala-issues mailing list archives

From "Alexander Behm (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (IMPALA-5381) Add query option to control join strategy when tables have no stats
Date Mon, 05 Jun 2017 18:44:04 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Behm resolved IMPALA-5381.
------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.9.0

commit ecda49f3e3001e23bebd6bdfaa1c612716df4bf1
Author: Alex Behm <alex.behm@cloudera.com>
Date:   Thu Jun 1 18:39:43 2017 -0700

    IMPALA-5381: Adds DEFAULT_JOIN_DISTRIBUTION_MODE query option.
    
    Adds a new query option DEFAULT_JOIN_DISTRIBUTION_MODE to
    control which join distribution mode is chosen when the join
    inputs have an unknown cardinality (e.g., missing stats) or when
    the expected costs of the different strategies are equal.
    
    Values for DEFAULT_JOIN_DISTRIBUTION_MODE: [BROADCAST, SHUFFLE]
    Default: BROADCAST
    
    Note that this change effectively undoes IMPALA-5120.
    
    Testing:
    - Added new planner tests
    - Core/hdfs run passed
    
    Change-Id: Ibd34442f422129d53bef5493fc9cbe7375a0765c
    Reviewed-on: http://gerrit.cloudera.org:8080/7059
    Reviewed-by: Alex Behm <alex.behm@cloudera.com>
    Tested-by: Impala Public Jenkins
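
The rule described in the commit message can be sketched as follows. This is only an illustration, not the planner's actual code (the Impala frontend is written in Java and its cost model is more involved). The names `JoinDistributionMode` and `choose_distribution_mode`, and the use of `None` to model an unknown cost, are assumptions made for this example.

```python
from enum import Enum
from typing import Optional


class JoinDistributionMode(Enum):
    BROADCAST = "BROADCAST"
    SHUFFLE = "SHUFFLE"


def choose_distribution_mode(
    broadcast_cost: Optional[float],
    shuffle_cost: Optional[float],
    default_mode: JoinDistributionMode = JoinDistributionMode.BROADCAST,
) -> JoinDistributionMode:
    """Pick a join distribution mode from two estimated costs.

    Hypothetical simplification: None stands for a cost that cannot be
    estimated, e.g. because the join inputs have no stats.
    """
    # Unknown cardinality: fall back to DEFAULT_JOIN_DISTRIBUTION_MODE.
    if broadcast_cost is None or shuffle_cost is None:
        return default_mode
    # A strictly cheaper strategy wins regardless of the default.
    if broadcast_cost < shuffle_cost:
        return JoinDistributionMode.BROADCAST
    if shuffle_cost < broadcast_cost:
        return JoinDistributionMode.SHUFFLE
    # Equal expected costs: the default breaks the tie.
    return default_mode
```

In a session, the option would typically be changed with a statement such as `SET DEFAULT_JOIN_DISTRIBUTION_MODE=SHUFFLE;` before running the query. In the sketch, that corresponds to passing `default_mode=JoinDistributionMode.SHUFFLE`.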


> Add query option to control join strategy when tables have no stats
> -------------------------------------------------------------------
>
>                 Key: IMPALA-5381
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5381
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Frontend
>            Reporter: Greg Rahn
>            Assignee: Alexander Behm
>            Priority: Critical
>             Fix For: Impala 2.9.0
>
>
> In IMPALA-5120 the join strategy was changed from broadcast to shuffle when tables have no stats. Adding a query option to specify the behavior lowers the risk for users who may have come to rely on the previous behavior, since it allows them to revert to it.
> Query option proposal:
> {noformat}
> default_join_distribution_mode = [ broadcast | shuffle ] 
> {noformat}
> Ideally, the default would be shuffle, but in the spirit of preserving existing behavior it will stay broadcast. We should re-evaluate this choice in a compatibility-breaking release.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
