Mailing-List: contact issues-help@spark.apache.org; run by ezmlm
Precedence: bulk
Date: Wed, 24 Feb 2016 18:30:18 +0000 (UTC)
From: "Apache Spark (JIRA)" <jira@apache.org>
To: issues@spark.apache.org
Message-ID: <JIRA.12943031.1456338053000.135896.1456338618218@Atlassian.JIRA>
In-Reply-To: <JIRA.12943031.1456338053000@Atlassian.JIRA>
References: <JIRA.12943031.1456338053000@Atlassian.JIRA>
 <JIRA.12943031.1456338053568@arcas>
Subject: [jira] [Assigned] (SPARK-13473) Predicate can't be pushed through
 project with nondeterministic field
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/SPARK-13473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-13473:
------------------------------------

    Assignee: Cheng Lian  (was: Apache Spark)

> Predicate can't be pushed through project with nondeterministic field
> ---------------------------------------------------------------------
>
>                 Key: SPARK-13473
>                 URL: https://issues.apache.org/jira/browse/SPARK-13473
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.2, 1.6.0, 2.0.0
>            Reporter: Cheng Lian
>            Assignee: Cheng Lian
>
> The following Spark shell snippet reproduces this issue:
> {code}
> import org.apache.spark.sql.functions._
> val parallelism = 8 // Adjust this to default parallelism
> val df = sqlContext.
>   range(2 * parallelism). // 8 partitions, 2 elements per partition
>   select(
>     col("id"),
>     monotonicallyIncreasingId().as("long_id")
>   )
> df.show()
> // +---+-----------+
> // | id|    long_id|
> // +---+-----------+
> // |  0|          0|
> // |  1|          1|
> // |  2| 8589934592|
> // |  3| 8589934593|
> // |  4|17179869184|
> // |  5|17179869185|
> // |  6|25769803776|
> // |  7|25769803777|
> // |  8|34359738368|
> // |  9|34359738369|
> // | 10|42949672960|
> // | 11|42949672961|
> // | 12|51539607552|
> // | 13|51539607553|
> // | 14|60129542144|
> // | 15|60129542145|
> // +---+-----------+
> df.
>   filter(col("id") === 3). // 2nd element in the 2nd partition
>   show()
> // +---+----------+
> // | id|   long_id|
> // +---+----------+
> // |  3|8589934592|
> // +---+----------+
> {code}
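> {{monotonicallyIncreasingId}} is nondeterministic: the value it produces for a row depends on that row's position within its partition.
> In the first output, id 3 has long_id 8589934593, but after the filter it has 8589934592, the value the first row of the 2nd partition would get.
> This is consistent with the filter being evaluated before the projection, so a predicate should not be pushed through a project that contains a nondeterministic field.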
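
The difference between the two outputs above can be reproduced without Spark. The following is a minimal sketch in plain Scala collections, not Spark's implementation. It assumes the ID layout described in Spark's documentation for this function (partition index in the upper 31 bits, record number within the partition in the lower 33 bits). The names PushdownSketch, monoId, and project are hypothetical and exist only for this illustration.

```scala
object PushdownSketch {
  // Assumption: partition index in the upper 31 bits, per-partition
  // record number in the lower 33 bits (hypothetical helper, not a Spark API).
  def monoId(partitionIndex: Int, recordNumber: Long): Long =
    (partitionIndex.toLong << 33) + recordNumber

  // 8 partitions with 2 ids each, mirroring range(2 * parallelism) in the report.
  val partitions: Seq[Seq[Long]] =
    (0L until 16L).grouped(2).map(_.toSeq).toSeq

  // "Project": attach an ID to each row based on its position in its partition.
  def project(parts: Seq[Seq[Long]]): Seq[Seq[(Long, Long)]] =
    parts.zipWithIndex.map { case (rows, p) =>
      rows.zipWithIndex.map { case (id, n) => (id, monoId(p, n.toLong)) }
    }

  // Order as written in the query: project first, then filter.
  // id 3 is the 2nd row of partition 1, so its long_id is 8589934593.
  val projectThenFilter: Seq[(Long, Long)] =
    project(partitions).flatten.filter(_._1 == 3L)

  // Order after pushing the filter below the projection: filter first.
  // id 3 is now the only row left in partition 1, so its long_id is 8589934592.
  val filterThenProject: Seq[(Long, Long)] =
    project(partitions.map(_.filter(_ == 3L))).flatten

  def main(args: Array[String]): Unit = {
    println(projectThenFilter)
    println(filterThenProject)
  }
}
```

The two orders disagree only because the projected value depends on row position, which is why the order of filter and project matters for nondeterministic fields.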

