hive-dev mailing list archives

From "Harish Butani (JIRA)" <>
Subject [jira] [Created] (HIVE-7905) CBO: more cost model changes
Date Fri, 29 Aug 2014 01:47:09 GMT
Harish Butani created HIVE-7905:

             Summary: CBO: more cost model changes
                 Key: HIVE-7905
             Project: Hive
          Issue Type: Sub-task
            Reporter: Harish Butani
            Assignee: Harish Butani

1. For composite predicates, smooth the selectivity calculation using +exponential backoff+.
Thanks to [~mmokhtar] for this formula.

Can you change the algorithm to use exponential backoff:
ndv(pe0) * ndv(pe1)^(1/2) * ndv(pe2)^(1/4) * ndv(pe3)^(1/8)

As opposed to:


If we assume a selectivity of 0.7 for each store_sales join, the join selectivity can end up
being 6.24285E-05, which is too low and eventually results in a suboptimal plan.

See attached picture.
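
As a rough illustration of the backoff formula above (a sketch only, not the Hive implementation), the product can be written as below. The function name backoff_product is hypothetical, and the exponents are applied in whatever order the caller passes the values; how the predicate expressions pe0..peN should be ordered is not stated in this message, so that choice is left to the caller.

```python
from typing import Sequence


def backoff_product(values: Sequence[float]) -> float:
    """Multiply values with exponents 1, 1/2, 1/4, 1/8, ... in the given order.

    Hypothetical helper: mirrors
    ndv(pe0) * ndv(pe1)^(1/2) * ndv(pe2)^(1/4) * ndv(pe3)^(1/8).
    """
    result = 1.0
    exponent = 1.0
    for value in values:
        result *= value ** exponent
        exponent /= 2.0  # each later term contributes half as strongly
    return result


if __name__ == "__main__":
    # Made-up NDVs, for illustration only.
    print(backoff_product([100.0, 16.0, 16.0]))
```

Because each later term is damped by a halved exponent, adding more predicate expressions changes the estimate less and less, which is the smoothing effect the formula is after.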

2. For Fact-Dim joins on the Dim table's primary key, we infer the join cardinality as a filter
on the Fact table:
join card = rowCount(Fact table) * selectivity(Dim table)

Whether a column is a key is inferred from either:
* table rowCount = column ndv
* (tbd shortly) table rowCount = (maxVal - minVal)
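
The key inference and the Fact-Dim cardinality rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Hive code: the function names are hypothetical, the selectivity of the Dim table is assumed to be the fraction of its rows that survive its filters, and only the first key rule (rowCount = ndv) is shown because the range-based rule is still marked tbd.

```python
def is_key_column(table_row_count: int, column_ndv: int) -> bool:
    """Treat a column as a key when its NDV equals the table row count."""
    return table_row_count == column_ndv


def fact_dim_join_cardinality(fact_row_count: float, dim_selectivity: float) -> float:
    """join card = rowCount(Fact table) * selectivity(Dim table).

    dim_selectivity is assumed to lie in [0, 1].
    """
    return fact_row_count * dim_selectivity


if __name__ == "__main__":
    # Made-up statistics, for illustration only.
    if is_key_column(table_row_count=1000, column_ndv=1000):
        print(fact_dim_join_cardinality(1_000_000, 0.25))
```

Under this rule the join behaves like a filter on the Fact table: a Dim table filtered down to a quarter of its rows keeps roughly a quarter of the Fact rows.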

This message was sent by Atlassian JIRA