spark-issues mailing list archives

From "Wenchen Fan (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-22310) Refactor join estimation to incorporate estimation logic for different kinds of statistics
Date Tue, 31 Oct 2017 10:15:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-22310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-22310.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0

> Refactor join estimation to incorporate estimation logic for different kinds of statistics
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-22310
>                 URL: https://issues.apache.org/jira/browse/SPARK-22310
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Zhenhua Wang
>            Assignee: Zhenhua Wang
>             Fix For: 2.3.0
>
>
> The current join estimation logic is based only on basic column statistics (such as ndv, the number of distinct values). If we want to add estimation for other kinds of statistics (such as histograms), it's not easy to incorporate them into the current algorithm:
> 1. When we have multiple pairs of join keys, the current algorithm computes cardinality in a single formula. But if different join keys have different kinds of stats, the computation logic for each pair of join keys becomes different, so the previous formula does not apply.
> 2. Currently it computes cardinality and updates the join keys' column stats separately. It's better to do these two steps together, since both the computation and the update logic are different for different kinds of stats.
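
The two points above can be illustrated with a minimal sketch. This is not Spark's code: the names `Bucket`, `ColumnStat`, `by_ndv`, `by_histogram`, and `estimate_join` are hypothetical, the histogram path assumes both sides share identical bucket boundaries, and combining key pairs by taking the smallest per-pair estimate is an assumption of this sketch. It only shows the shape of the idea: handle one pair of join keys at a time, pick the estimator that matches the stats available for that pair, and have each estimator return the cardinality and the updated column stat together.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Bucket:
    """One histogram bucket (hypothetical shape): row count and distinct values."""
    rows: int
    ndv: int


@dataclass(frozen=True)
class ColumnStat:
    """Hypothetical column statistics: ndv always, a histogram only sometimes."""
    ndv: int
    histogram: Optional[Tuple[Bucket, ...]] = None


def by_ndv(left: ColumnStat, right: ColumnStat,
           left_rows: int, right_rows: int) -> Tuple[int, ColumnStat]:
    """Basic-stats path: |L| * |R| / max(ndv_L, ndv_R), plus the updated stat."""
    max_ndv = max(left.ndv, right.ndv)
    card = left_rows * right_rows // max_ndv if max_ndv > 0 else 0
    return card, ColumnStat(ndv=min(left.ndv, right.ndv))


def by_histogram(left: ColumnStat, right: ColumnStat) -> Tuple[int, ColumnStat]:
    """Histogram path: apply the same ratio per bucket and sum the results.

    Assumes (for this sketch only) that both histograms have aligned buckets.
    """
    buckets: List[Bucket] = []
    for lb, rb in zip(left.histogram, right.histogram):
        max_ndv = max(lb.ndv, rb.ndv)
        rows = lb.rows * rb.rows // max_ndv if max_ndv > 0 else 0
        buckets.append(Bucket(rows=rows, ndv=min(lb.ndv, rb.ndv)))
    card = sum(b.rows for b in buckets)
    ndv = sum(b.ndv for b in buckets)
    return card, ColumnStat(ndv=ndv, histogram=tuple(buckets))


def estimate_join(left_rows: int, right_rows: int,
                  key_pairs: List[Tuple[str, ColumnStat, ColumnStat]]
                  ) -> Tuple[int, Dict[str, ColumnStat]]:
    """Estimate join cardinality one key pair at a time.

    Each pair yields its cardinality and its updated stat in a single step;
    the overall estimate is the smallest per-pair value (a sketch assumption).
    """
    card = left_rows * right_rows  # Cartesian product as the starting upper bound
    stats: Dict[str, ColumnStat] = {}
    for name, left, right in key_pairs:
        if left.histogram is not None and right.histogram is not None:
            pair_card, pair_stat = by_histogram(left, right)
        else:
            pair_card, pair_stat = by_ndv(left, right, left_rows, right_rows)
        card = min(card, pair_card)
        stats[name] = pair_stat
    return card, stats


if __name__ == "__main__":
    # Key "a" has only ndv on both sides; key "b" has histograms on both sides.
    left_b = ColumnStat(ndv=30, histogram=(Bucket(600, 10), Bucket(400, 20)))
    right_b = ColumnStat(ndv=45, histogram=(Bucket(300, 5), Bucket(200, 40)))
    pairs = [("a", ColumnStat(ndv=100), ColumnStat(ndv=50)),
             ("b", left_b, right_b)]
    print(estimate_join(1000, 500, pairs))
```

Because each estimator returns a `(cardinality, updated stat)` pair, supporting another kind of statistic in this sketch means adding one more function and one more branch in `estimate_join`, rather than reworking a single combined formula.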



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
