Date: Thu, 24 Nov 2016 14:02:58 +0000 (UTC)
From: "Takeshi Yamamuro (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-18577) Ambiguous reference with duplicate column names in aggregate

[ https://issues.apache.org/jira/browse/SPARK-18577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693356#comment-15693356 ]

Takeshi Yamamuro commented on SPARK-18577:
------------------------------------------

I reproduced this on master, but do we need to resolve this reference?
For example, PostgreSQL cannot resolve the same query either:

{code}
postgres=# CREATE TABLE t(id INT, name VARCHAR, rank FLOAT8);
CREATE TABLE
postgres=# \d t
            Table "public.t"
 Column |       Type        | Modifiers
--------+-------------------+-----------
 id     | integer           |
 name   | character varying |
 rank   | double precision  |

postgres=# INSERT INTO t values(1, 'xxx', 1.0);
INSERT 0 1
postgres=# SELECT * FROM t;
 id | name | rank
----+------+------
  1 | xxx  |    1
(1 row)

postgres=# SELECT id, COUNT(*) FROM t t1 JOIN t t2 ON t1.name = t2.name GROUP BY t1.id;
ERROR: column reference "id" is ambiguous at character 8
{code}

> Ambiguous reference with duplicate column names in aggregate
> ------------------------------------------------------------
>
>                 Key: SPARK-18577
>                 URL: https://issues.apache.org/jira/browse/SPARK-18577
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.2
>            Reporter: Yerui Sun
>
> Assume we have a table 't' with three columns, 'id', 'name' and 'rank'. The following SQL reproduces the issue:
> {code}
> select id, count(*) from t t1 join t t2 on t1.name = t2.name group by t1.id
> {code}
> The error message is:
> {code}
> Reference 'id' is ambiguous, could be: id#3, id#9.; line 1 pos 7
> {code}
> The SQL can be parsed in Hive, since the 'id' reference in the select list can be resolved to 't1.id', which is present in the group expressions.
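
The behaviour discussed above can be checked with a small script. This is only a sketch: it uses SQLite through Python's standard `sqlite3` module as a stand-in engine, which is an assumption of this example and not something used in the thread, so it says nothing about how Spark or Hive resolve the reference. The helper name `run_queries` is hypothetical. It runs the unqualified query from the report, then the same query with the select-list column written as `t1.id`.

```python
import sqlite3


def run_queries():
    """Return (error text of the unqualified query, rows of the qualified query)."""
    conn = sqlite3.connect(":memory:")
    # Same shape as the table in the report; SQLite types are an approximation.
    conn.execute('CREATE TABLE t (id INTEGER, name TEXT, "rank" REAL)')
    conn.execute("INSERT INTO t VALUES (?, ?, ?)", (1, "xxx", 1.0))

    join = "FROM t t1 JOIN t t2 ON t1.name = t2.name GROUP BY t1.id"

    # Unqualified `id` in the select list: both t1 and t2 expose an `id` column.
    try:
        conn.execute("SELECT id, COUNT(*) " + join)
        error = None
    except sqlite3.OperationalError as exc:
        error = str(exc)

    # Qualified `t1.id`: names exactly one column, the one used for grouping.
    rows = conn.execute("SELECT t1.id, COUNT(*) " + join).fetchall()
    conn.close()
    return error, rows


error, rows = run_queries()
print("unqualified:", error)
print("qualified:", rows)
```

Writing `t1.id` in the select list sidesteps the question of whether the engine should infer the qualifier from the GROUP BY clause, so it is a portable workaround regardless of how the issue is decided.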