hive-issues mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15489) Alternatively use table scan stats for HoS
Date Fri, 10 Feb 2017 23:06:41 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861978#comment-15861978 ]

Xuefu Zhang commented on HIVE-15489:
------------------------------------

{quote}
I've thought about this. The downside is that many good cases will be turned into reduce
joins as well. But I think this config is mainly for stability, so it should be fine, as long
as we document this well. Will add to next patch.
{quote}
My concern is that the map joins further down may also suffer the consequences of inaccurate
stats.

{quote}
Do you think we should combine these two? since they are similar.
{quote}
It's probably better to keep two, as they control different functionality.

> Alternatively use table scan stats for HoS
> ------------------------------------------
>
>                 Key: HIVE-15489
>                 URL: https://issues.apache.org/jira/browse/HIVE-15489
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark, Statistics
>    Affects Versions: 2.2.0
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>         Attachments: HIVE-15489.1.patch, HIVE-15489.2.patch, HIVE-15489.3.patch, HIVE-15489.4.patch, HIVE-15489.wip.patch
>
>
> For MapJoin in HoS, we should provide an option to only use stats in the TS (table scan)
> rather than the populated stats in each of the join branches. This could be pretty
> conservative but more reliable.
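
To make the trade-off in the description concrete, here is a minimal Java sketch. It is not
Hive code: the class, fields, and threshold are hypothetical names chosen for illustration,
and the size check is a simplified assumption about how a map join decision could be made
(every branch except the largest must fit under a byte threshold). It only shows how
switching from propagated stats to table scan stats can turn a would-be map join into a
reduce join.

```java
import java.util.Arrays;
import java.util.List;

public class MapJoinStatsSketch {

    /** Hypothetical stand-in for one join branch; not a Hive class. */
    static final class Branch {
        final String name;
        final long tableScanBytes;   // size reported at the table scan at the root of the branch
        final long propagatedBytes;  // estimate propagated through filters etc. up to the join

        Branch(String name, long tableScanBytes, long propagatedBytes) {
            this.name = name;
            this.tableScanBytes = tableScanBytes;
            this.propagatedBytes = propagatedBytes;
        }
    }

    /** Picks which size estimate to trust for a branch. */
    static long sizeFor(Branch b, boolean useTableScanStats) {
        return useTableScanStats ? b.tableScanBytes : b.propagatedBytes;
    }

    /**
     * Assumed rule for this sketch: the largest branch is streamed, and all
     * other branches together must fit within thresholdBytes to allow a map join.
     */
    static boolean canConvertToMapJoin(List<Branch> branches,
                                       boolean useTableScanStats,
                                       long thresholdBytes) {
        long total = 0;
        long largest = 0;
        for (Branch b : branches) {
            long size = sizeFor(b, useTableScanStats);
            total += size;
            largest = Math.max(largest, size);
        }
        return total - largest <= thresholdBytes;
    }

    public static void main(String[] args) {
        final long mb = 1L << 20;
        List<Branch> branches = Arrays.asList(
            new Branch("fact", 10_000 * mb, 10_000 * mb),
            // The filter estimate says 20 MB, but the scan itself is 500 MB.
            new Branch("dim", 500 * mb, 20 * mb));
        long threshold = 100 * mb;

        System.out.println("propagated stats -> map join: "
            + canConvertToMapJoin(branches, false, threshold));
        System.out.println("table scan stats -> map join: "
            + canConvertToMapJoin(branches, true, threshold));
    }
}
```

With these made-up numbers the propagated estimate permits a map join while the table scan
size does not. If the 20 MB estimate is accurate, this is one of the "good cases" that gets
turned into a reduce join; if it is wrong, the conservative choice avoids loading an
oversized table into memory.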



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
