Date: Sun, 18 Oct 2015 12:55:04 -0700 (MST)
From: gsvic
To: dev@spark.apache.org
Message-ID: <1445198104918-14672.post@n3.nabble.com>
Subject: ShuffledHashJoin Possible Issue

I am running some experiments with join algorithms in Spark SQL and have run into the following issue.

I have constructed two "dummy" JSON tables, t1.json and t2.json. Each has two columns, ID and Value. ID is a unique, incrementing integer and Value is a random value. I am running an equi-join query on the ID attribute.

With the SortMerge and BroadcastHashJoin algorithms the returned result is correct, but with ShuffledHashJoin the count aggregate always returns zero. The correct result is the rows of t2 (so the count should equal the number of rows in t2), as t2.ID is a subset of t1.ID.

The query is:

    t1.join(t2).where(t1("ID").equalTo(t2("ID")))

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/ShuffledHashJoin-Possible-Issue-tp14672.html
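To make the expected result concrete, here is a minimal sketch in plain Scala (no Spark dependency) of a hash-based equi-join on ID. It is not Spark's ShuffledHashJoin implementation. The names Row, EquiJoinCheck and hashJoin, the table sizes, and the random seed are all hypothetical and chosen only for illustration. It shows that when t2's IDs are a subset of t1's unique IDs, an equi-join on ID should return exactly as many rows as t2 has, so a count of zero indicates a problem.

```scala
// Hypothetical stand-in for one row of t1.json / t2.json (columns ID and Value).
final case class Row(id: Int, value: Double)

object EquiJoinCheck {
  // Hash-based equi-join on id: build a hash table from one side,
  // then probe it with every row of the other side.
  def hashJoin(build: Seq[Row], probe: Seq[Row]): Seq[(Row, Row)] = {
    val table: Map[Int, Seq[Row]] = build.groupBy(_.id)
    for {
      p <- probe
      b <- table.getOrElse(p.id, Seq.empty)
    } yield (p, b)
  }

  def main(args: Array[String]): Unit = {
    val rnd = new scala.util.Random(42) // assumed seed, for repeatability only

    // t1: unique incrementing IDs with random values (size is an assumption).
    val t1 = (1 to 1000).map(i => Row(i, rnd.nextDouble()))
    // t2: IDs are a subset of t1's IDs (every tenth ID, also an assumption).
    val t2 = (1 to 1000 by 10).map(i => Row(i, rnd.nextDouble()))

    val joined = hashJoin(build = t2, probe = t1)

    // Each t2 ID matches exactly one t1 row, so the counts must agree.
    require(joined.size == t2.size, "equi-join on a subset key should return |t2| rows")
    println(s"joined rows: ${joined.size}, t2 rows: ${t2.size}")
  }
}
```

The same check can be applied to the Spark result: comparing the join's count against t2's row count gives a quick pass/fail test for each join algorithm.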