spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-15441) dataset outer join seems to return incorrect result
Date Thu, 26 May 2016 07:12:15 GMT

     [ https://issues.apache.org/jira/browse/SPARK-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Apache Spark reassigned SPARK-15441:
------------------------------------

    Assignee: Wenchen Fan  (was: Apache Spark)

> dataset outer join seems to return incorrect result
> ---------------------------------------------------
>
>                 Key: SPARK-15441
>                 URL: https://issues.apache.org/jira/browse/SPARK-15441
>             Project: Spark
>          Issue Type: Bug
>          Components: sq;
>            Reporter: Reynold Xin
>            Assignee: Wenchen Fan
>            Priority: Critical
>
> See notebook
> https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/6122906529858466/2836020637783173/5382278320999420/latest.html
> {code}
> import org.apache.spark.sql.functions
> val left = List(("a", 1), ("a", 2), ("b", 3), ("c", 4)).toDS()
> val right = List(("a", "x"), ("b", "y"), ("d", "z")).toDS()
> // The last row _1 should be null, rather than (null, -1)
> left.toDF("k", "v").as[(String, Int)].alias("left")
>   .joinWith(right.toDF("k", "u").as[(String, String)].alias("right"), functions.col("left.k")
=== functions.col("right.k"), "right_outer")
>   .show()
> {code}
> The returned result currently is
> {code}
> +---------+-----+
> |       _1|   _2|
> +---------+-----+
> |    (a,2)|(a,x)|
> |    (a,1)|(a,x)|
> |    (b,3)|(b,y)|
> |(null,-1)|(d,z)|
> +---------+-----+
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org


Mime
View raw message