hive-issues mailing list archives

From "Panagiotis Garefalakis (Jira)" <>
Subject [jira] [Updated] (HIVE-23006) Basic compiler support for Probe MapJoin
Date Thu, 09 Apr 2020 17:29:00 GMT


Panagiotis Garefalakis updated HIVE-23006:
    Summary: Basic compiler support for Probe MapJoin  (was: Compiler support for Probe MapJoin)

> Basic compiler support for Probe MapJoin
> ----------------------------------------
>                 Key: HIVE-23006
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23006.01.patch, HIVE-23006.02.patch, HIVE-23006.03.patch
>          Time Spent: 3h
>  Remaining Estimate: 0h
> The decision to push information down to the record reader (potentially reducing decoding
time through row-level filtering) should be made at query compilation time.
> This patch adds an extra optimisation step with the goal of finding Table Scan operators
that could reduce the number of rows decoded at runtime using extra available information.
> It currently looks for all the available MapJoin operators that could use the smaller
HashTable on the probing side (where the TS is) to filter out rows that would never match.
> To do so, the HashTable information is pushed down to the TS properties and then propagated
as part of MapWork.
> If a single TS is used by multiple operators (shared-work), this rule cannot be applied.
> This rule can be extended to support static filter expressions like:
> _select * from sales where sold_state = 'PR';_
> This optimisation mainly targets the Tez execution engine running on LLAP.
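
To illustrate the idea in the description above, here is a minimal sketch of probe-side filtering: the join keys of the small (build-side) table are collected into a set, and the scan of the large table drops rows whose key can never match. This is not Hive's implementation. The class `ProbeFilterSketch`, the `Row` type, and both method names are hypothetical and chosen only for illustration, and a plain `HashSet` stands in for the MapJoin HashTable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Hive code: all names here are illustrative assumptions.
class ProbeFilterSketch {

    /** Hypothetical row of the large (probe-side) table: a join key plus a payload. */
    static final class Row {
        final int joinKey;
        final String payload;

        Row(int joinKey, String payload) {
            this.joinKey = joinKey;
            this.payload = payload;
        }
    }

    /** Collects the join keys of the small table; stands in for the MapJoin HashTable. */
    static Set<Integer> buildKeySet(List<Integer> smallTableKeys) {
        return new HashSet<>(smallTableKeys);
    }

    /** Scan-side filter: keeps only the probe rows whose key exists on the build side. */
    static List<Row> scanWithProbeFilter(List<Row> probeRows, Set<Integer> buildKeys) {
        List<Row> kept = new ArrayList<>();
        for (Row row : probeRows) {
            // Rows whose key is absent from the build side would never match the join,
            // so they are dropped before any further processing.
            if (buildKeys.contains(row.joinKey)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<Integer> buildKeys = buildKeySet(Arrays.asList(1, 3));
        List<Row> probeRows = Arrays.asList(
                new Row(1, "a"), new Row(2, "b"), new Row(3, "c"));
        for (Row row : scanWithProbeFilter(probeRows, buildKeys)) {
            System.out.println(row.joinKey + " " + row.payload);
        }
    }
}
```

In this sketch the filter runs over rows that are already materialised. The description's point is that making the decision at compile time lets the record reader apply the same membership test before rows are fully decoded.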

This message was sent by Atlassian Jira
