spark-issues mailing list archives

From "Ram Kandasamy (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (SPARK-11430) DataFrame's except method does not work, returns 0
Date Fri, 30 Oct 2015 21:56:27 GMT

     [ https://issues.apache.org/jira/browse/SPARK-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ram Kandasamy closed SPARK-11430.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.1

> DataFrame's except method does not work, returns 0
> --------------------------------------------------
>
>                 Key: SPARK-11430
>                 URL: https://issues.apache.org/jira/browse/SPARK-11430
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Ram Kandasamy
>             Fix For: 1.5.1
>
>
> This may or may not be related to SPARK-11427: https://issues.apache.org/jira/browse/SPARK-11427
> In short, the except method on DataFrames should mirror the subtract method on RDDs, but it does not.
> Here is an example:
> scala> val firstFile = sqlContext.read.parquet("/Users/ramkandasamy/sparkData/2015-07-25/*").select("id").distinct
> firstFile: org.apache.spark.sql.DataFrame = [id: string]
> scala> val secondFile = sqlContext.read.parquet("/Users/ramkandasamy/sparkData/2015-10-23/*").select("id").distinct
> secondFile: org.apache.spark.sql.DataFrame = [id: string]
> scala> firstFile.count
> res1: Long = 1072046
> scala> secondFile.count
> res2: Long = 3569941
> scala> firstFile.except(secondFile).count
> res3: Long = 0
> scala> firstFile.rdd.subtract(secondFile.rdd).count
> res4: Long = 1072046
> Can anyone help out here? Thanks!
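
For reference, a minimal sketch of the set-difference semantics the report expects from both calls. It uses plain Scala collections so it runs without Spark. The helper name exceptDistinct and the sample ids are hypothetical, not part of the original report or of the Spark API.

```scala
// Plain Scala collections stand in for the two DataFrames in the report.
// The ids below are invented for illustration only.
object ExceptSemantics {
  // Elements of `left` that do not occur in `right`, with duplicates removed.
  // This is the result the report expects from firstFile.except(secondFile);
  // both inputs in the report were already made distinct.
  def exceptDistinct[A](left: Seq[A], right: Seq[A]): Seq[A] = {
    val exclude = right.toSet
    left.distinct.filterNot(exclude.contains)
  }

  def main(args: Array[String]): Unit = {
    val firstIds  = Seq("a", "b", "c", "c") // hypothetical ids
    val secondIds = Seq("c", "d")           // hypothetical ids
    val onlyInFirst = exceptDistinct(firstIds, secondIds)
    println(onlyInFirst)      // List(a, b)
    println(onlyInFirst.size) // 2
  }
}
```

When the two inputs share no ids, the result should equal the first input, so a count of 0 alongside a non-zero subtract count (as in res3 and res4 above) points to a disagreement between the two code paths.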



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

