spark-issues mailing list archives

From "Sun Rui (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-16464) withColumn() allows illegal creation of duplicate column names on DataFrame
Date Wed, 20 Jul 2016 06:08:20 GMT

    [ https://issues.apache.org/jira/browse/SPARK-16464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385418#comment-15385418 ]

Sun Rui commented on SPARK-16464:
---------------------------------

This SparkR bug has been fixed in Spark 2.0 but still exists in 1.6.x. Please refer to https://issues.apache.org/jira/browse/SPARK-12204

> withColumn() allows illegal creation of duplicate column names on DataFrame
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-16464
>                 URL: https://issues.apache.org/jira/browse/SPARK-16464
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR, SQL
>    Affects Versions: 1.6.1
>         Environment: Databricks.com
>            Reporter: Neil Dewar
>            Priority: Minor
>
> If I take an existing DataFrame, I am permitted to use withColumn() to create a duplicate
> column name. I assume this should be illegal, and withColumn should be prevented from permitting
> this. Some functions subsequently fail due to the duplicate column names. Example:
> sdfCar <- createDataFrame(sqlContext, mtcars)
> sdfCar1 <- withColumn(sdfCar, "isEfficient", sdfCar$mpg<=20)
> sdfCar1 <- withColumn(sdfCar1, "isEfficient", ifelse(sdfCar1$mpg == sdfCar1$mpg,1,0))
> sdfCar2 <- subset(sdfCar1, select=sdfCar1$isEfficient)
> # subset() command fails with message: "Reference 'isEfficient' is ambiguous"
> Note: I have only confirmed this in SparkR; it might affect other language APIs.
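
For anyone who has to stay on 1.6.x, one possible guard is to check the existing column names before calling withColumn(). The sketch below is only an illustration: assert_new_column is a hypothetical helper, not part of SparkR and not the fix made in SPARK-12204. It assumes the caller passes in the character vector returned by columns(df), and it compares names exactly as written.

```r
# Hypothetical helper (not part of SparkR): fail early if `new_name`
# is already present in `existing`, a character vector of column names
# such as the one returned by columns(df).
assert_new_column <- function(existing, new_name) {
  # Exact, case-sensitive comparison; adjust if your setup resolves
  # column names differently.
  if (new_name %in% existing) {
    stop(sprintf("Column '%s' already exists", new_name), call. = FALSE)
  }
  invisible(TRUE)
}

# Intended use with the example above (needs a running SparkR session,
# so it is left commented out here):
# assert_new_column(columns(sdfCar1), "isEfficient")
# sdfCar1 <- withColumn(sdfCar1, "isEfficient", sdfCar1$mpg <= 20)
```

With this guard in place, the second withColumn() call in the example would stop with an error instead of silently creating a second isEfficient column.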



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

