flink-issues mailing list archives

From "pietro pinoli (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (FLINK-685) Add support for semi-joins
Date Tue, 21 Jul 2015 01:01:04 GMT

     [ https://issues.apache.org/jira/browse/FLINK-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pietro pinoli reassigned FLINK-685:

    Assignee: pietro pinoli

> Add support for semi-joins
> --------------------------
>                 Key: FLINK-685
>                 URL: https://issues.apache.org/jira/browse/FLINK-685
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: GitHub Import
>            Assignee: pietro pinoli
>            Priority: Minor
>              Labels: github-import
>             Fix For: pre-apache
> A semi-join is basically a join used as a filter. One input is the "filtering" input and the
> other one is the "filtered" input.
> A tuple of the "filtered" input is emitted exactly once if the "filtering" input has
> one (or more) tuples with matching join keys. That means that the output of a semi-join has
> the same type as the "filtered" input, and the "filtering" input is completely discarded.
> In order to support a semi-join, we need to add an additional physical execution strategy
> that ensures that a tuple of the "filtered" input is emitted only once, even if the "filtering"
> input has more than one tuple with matching keys. Furthermore, a couple of optimizations compared
> to standard joins are possible, such as storing only the keys, not the full tuples, of the
> "filtering" input in a hash table.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/685
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime
> Milestone: Release 0.6 (unplanned)
> Created at: Mon Apr 14 12:05:29 CEST 2014
> State: open
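
To make the semantics described in the issue concrete, here is a minimal in-memory sketch of a hash-based semi-join in plain Java. It is not Flink code and not the execution strategy the issue asks for; the class name SemiJoinSketch, the method semiJoin, and its parameters are hypothetical names chosen for illustration. It only shows the two points from the description: the build side keeps join keys rather than full tuples, and each "filtered" tuple is emitted at most once no matter how many "filtering" tuples share its key.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration only; not part of the Flink API.
public class SemiJoinSketch {

    /**
     * Returns every element of `filtered` whose key also occurs in `filtering`.
     * Each element of `filtered` appears at most once in the result, and the
     * result has the element type of `filtered`.
     */
    public static <L, R, K> List<L> semiJoin(
            List<L> filtered,
            List<R> filtering,
            Function<L, K> filteredKey,
            Function<R, K> filteringKey) {

        // Build side: keep only the join keys of the "filtering" input.
        // Duplicate keys collapse in the set, so the full tuples are never stored.
        Set<K> keys = new HashSet<>();
        for (R r : filtering) {
            keys.add(filteringKey.apply(r));
        }

        // Probe side: a single membership test per "filtered" tuple, so a tuple
        // is emitted once even if many "filtering" tuples had the same key.
        List<L> result = new ArrayList<>();
        for (L l : filtered) {
            if (keys.contains(filteredKey.apply(l))) {
                result.add(l);
            }
        }
        return result;
    }
}
```

For example, calling semiJoin with a "filtered" list of orders and a "filtering" list of customer IDs that contains the same ID several times still returns each matching order a single time. A standard inner join on the same inputs would instead emit one output pair per matching combination.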

This message was sent by Atlassian JIRA