Date: Tue, 24 May 2016 12:42:12 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@flink.apache.org
Reply-To: dev@flink.apache.org
Subject: [jira] [Commented] (FLINK-3941) Add support for UNION (with duplicate elimination)

    [ https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298105#comment-15298105 ]

ASF GitHub Bot commented on FLINK-3941:
---------------------------------------

Github user yjshen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2025#discussion_r64382835

    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/DataSetUnion.scala ---
    @@ -69,16 +73,23 @@ class DataSetUnion(
           rows + metadata.getRowCount(child)
         }
    -    planner.getCostFactory.makeCost(rowCnt, 0, 0)
    +    planner.getCostFactory.makeCost(
    +      rowCnt,
    +      if (all) 0 else rowCnt,
    +      if (all) 0 else rowCnt)
       }
       override def translateToPlan(
           tableEnv: BatchTableEnvironment,
           expectedType: Option[TypeInformation[Any]]): DataSet[Any] = {
    -    val leftDataSet = left.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
    -    val rightDataSet = right.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
    -    leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
    +    val leftDataSet = left.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
    +    val rightDataSet = right.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
    +    if (all) {
    +      leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
    +    } else {
    +      leftDataSet.union(rightDataSet).distinct().asInstanceOf[DataSet[Any]]
    --- End diff --

    In `DATASET_OPT_RULES`, `UnionToDistinctRule` substitutes `Union` with `UnionAll` followed by an `Aggregate`, so this branch is never actually executed.

> Add support for UNION (with duplicate elimination)
> --------------------------------------------------
>
>                 Key: FLINK-3941
>                 URL: https://issues.apache.org/jira/browse/FLINK-3941
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>    Affects Versions: 1.1.0
>            Reporter: Fabian Hueske
>            Assignee: Yijie Shen
>            Priority: Minor
>
> Currently, only UNION ALL is supported by the Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a {{DataSet.distinct()}} on all fields after the union. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extending the Table API with a union() method.

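The difference between UNION ALL and UNION described in the issue can be sketched with plain Scala collections standing in for Flink DataSets. This is only an illustration under stated assumptions, not the code from the pull request: `UnionSketch`, `Row`, `unionAll`, `union`, and `cost` are hypothetical names, and the cost triple merely mirrors the (rows, cpu, io) shape that the diff passes to `makeCost`.

```scala
// Illustration only: plain Scala collections stand in for Flink DataSets.
// All names in this object are hypothetical and are not part of Flink.
object UnionSketch {
  // Case classes compare field by field, so duplicates are detected on all fields.
  final case class Row(id: Int, name: String)

  // UNION ALL: keep every row from both inputs, duplicates included.
  def unionAll(left: Seq[Row], right: Seq[Row]): Seq[Row] =
    left ++ right

  // UNION: concatenate, then eliminate duplicates across all fields.
  def union(left: Seq[Row], right: Seq[Row]): Seq[Row] =
    (left ++ right).distinct

  // Same shape as the cost in the diff: (rows, cpu, io).
  // Duplicate elimination is charged extra cpu and io; UNION ALL is not.
  def cost(rowCnt: Double, all: Boolean): (Double, Double, Double) =
    if (all) (rowCnt, 0.0, 0.0) else (rowCnt, rowCnt, rowCnt)

  def main(args: Array[String]): Unit = {
    val left  = Seq(Row(1, "a"), Row(2, "b"))
    val right = Seq(Row(2, "b"), Row(3, "c"))
    println(unionAll(left, right)) // Row(2, "b") appears twice
    println(union(left, right))    // Row(2, "b") appears once
  }
}
```

As the review comment notes, the optimizer may rewrite a non-all union into a UNION ALL followed by an aggregate before translation, in which case only the `unionAll`-style path would be reached at runtime.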
-- This message was sent by Atlassian JIRA (v6.3.4#6332)