Date: Sat, 30 Apr 2016 05:29:12 +0000 (UTC)
From: "Davies Liu (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spark SQL JOIN : "false" in the left table is joined to "null" on the right table

     [ https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davies Liu updated SPARK-14757:
-------------------------------
    Assignee: Reynold Xin

> Incorrect behavior of Join operation in Spark SQL JOIN : "false" in the left table is joined to "null" on the right table
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-14757
>                 URL: https://issues.apache.org/jira/browse/SPARK-14757
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Hong Huang
>            Assignee: Reynold Xin
>             Fix For: 1.6.2, 2.0.0
>
>
> Content of table a:
> |outgoing_0|
> | false |
> | true |
> | null |
> a has only one field: outgoing_0
> Content of table b:
> |outgoing_1|
> | false |
> | true |
> | null |
> b has only one field: outgoing_1
> After running this query:
> {code}
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> {code}
> I got the following result:
> |outgoing_0|outgoing_1|
> | true | true |
> | false | false |
> | false | null |
> | null | null |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected: the null-safe equality operator <=> should match null only with null.
> The behavior is also asymmetric: the left "false" is matched with the right "null", yet the "false" in the right table is not matched with the "null" in the left table (there is no row with "null" as outgoing_0 and "false" as outgoing_1).
> You can easily reproduce this bug by pasting the following code fragments into spark-shell:
> {code}
> case class A( outgoing_0: Option[Boolean] )
> case class B( outgoing_1: Option[Boolean] )
> {code}
> {code}
> // None becomes a SQL NULL in the resulting DataFrame
> val a = sc.parallelize( Seq(
>   A( Some( false ) ),
>   A( Some( true ) ),
>   A( None )
> ) ).toDF()
> a.show()
> val b = sc.parallelize( Seq(
>   B( Some( false ) ),
>   B( Some( true ) ),
>   B( None )
> ) ).toDF()
> b.show()
> a.registerTempTable( "a" )
> b.registerTempTable( "b" )
> sqlContext.sql( "select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)" ).show()
> {code}
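
For reference, the result the report expects can be modeled in plain Scala without Spark. This is only a sketch of the intended semantics, not how Spark implements the join: the names `NullSafeJoinSketch`, `nullSafeEq`, and `fullOuterJoin` are hypothetical, `None` stands in for SQL NULL, and the join is a naive nested loop.

```scala
// Plain-Scala model of the expected semantics; no Spark dependency.
// All names here are hypothetical and exist only for illustration.
object NullSafeJoinSketch {
  // None stands in for SQL NULL.
  type Cell = Option[Boolean]

  // Null-safe equality (<=>): NULL <=> NULL is true,
  // NULL <=> value is false, otherwise ordinary equality.
  def nullSafeEq(l: Cell, r: Cell): Boolean = l == r

  // Naive nested-loop FULL OUTER JOIN on a single column:
  // matched pairs first, then unmatched rows from each side
  // padded with None (NULL) for the missing side.
  def fullOuterJoin(left: Seq[Cell], right: Seq[Cell]): Seq[(Cell, Cell)] = {
    val matched =
      for (l <- left; r <- right if nullSafeEq(l, r)) yield (l, r)
    val leftOnly = left
      .filterNot(l => right.exists(r => nullSafeEq(l, r)))
      .map(l => (l, Option.empty[Boolean]))
    val rightOnly = right
      .filterNot(r => left.exists(l => nullSafeEq(l, r)))
      .map(r => (Option.empty[Boolean], r))
    matched ++ leftOnly ++ rightOnly
  }

  def main(args: Array[String]): Unit = {
    // Same contents as tables a and b in the report.
    val a: Seq[Cell] = Seq(Some(false), Some(true), None)
    val b: Seq[Cell] = Seq(Some(false), Some(true), None)
    fullOuterJoin(a, b).foreach(println)
  }
}
```

Under this model the join of a and b yields exactly three rows (false/false, true/true, null/null), so the extra false/null row in the reported output has no counterpart.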