hawq-issues mailing list archives

From "Radar Lei (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HAWQ-1597) Implement Runtime Filter for Hash Join
Date Fri, 20 Jul 2018 10:03:00 GMT

     [ https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Radar Lei resolved HAWQ-1597.
-----------------------------
    Resolution: Fixed

> Implement Runtime Filter for Hash Join
> --------------------------------------
>
>                 Key: HAWQ-1597
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1597
>             Project: Apache HAWQ
>          Issue Type: New Feature
>          Components: Query Execution
>            Reporter: Lin Wen
>            Assignee: Lin Wen
>            Priority: Major
>             Fix For: 2.4.0.0-incubating
>
>         Attachments: 111BA854-7318-46A7-8338-5F2993D60FA3.png, HAWQ Runtime Filter Design.pdf,
HAWQ Runtime Filter Design.pdf, q17_modified_hawq.gif
>
>
> A Bloom filter is a space-efficient probabilistic data structure, invented in 1970, that
is used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and other data-intensive applications to
quickly filter data, and OLAP systems commonly apply them to hash joins. The basic idea is
that, when hash joining two tables, the build phase also builds a Bloom filter for the
inner table and pushes it down to the scan of the outer table, so that fewer tuples from
the outer table are returned to the hash join node and probed against the hash table. This
can greatly improve hash join performance when the filter eliminates most outer tuples.
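
The idea described above can be sketched roughly as follows. This is only an illustrative Python sketch, not HAWQ's actual implementation (HAWQ is written in C, and the design is in the attached PDF). The names `BloomFilter`, `hash_join_with_runtime_filter`, the bit-array size, and the number of hash functions are all hypothetical choices made for this example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; sizes are arbitrary, illustrative defaults."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        data = repr(key).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def hash_join_with_runtime_filter(inner_rows, outer_rows, inner_key, outer_key):
    """Hypothetical hash join that filters the outer scan with a Bloom filter."""
    # Build phase: populate the hash table and the Bloom filter together.
    hash_table = {}
    bloom = BloomFilter()
    for row in inner_rows:
        key = row[inner_key]
        hash_table.setdefault(key, []).append(row)
        bloom.add(key)

    # Probe phase: the "outer scan" drops tuples the filter rules out,
    # so only surviving tuples reach the hash table lookup.
    results = []
    passed_filter = 0
    for row in outer_rows:
        key = row[outer_key]
        if not bloom.might_contain(key):
            continue
        passed_filter += 1
        for match in hash_table.get(key, ()):
            results.append((match, row))
    return results, passed_filter
```

Because a Bloom filter can report false positives but never false negatives, the join result is unchanged by the filter; only the number of outer tuples that reach the probe step shrinks.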



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
