spark-issues mailing list archives

From "Herman van Hovell (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-24043) InterpretedPredicate.eval fails if expression tree contains Nondeterministic expressions
Date Mon, 07 May 2018 15:55:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-24043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Herman van Hovell resolved SPARK-24043.
---------------------------------------
       Resolution: Fixed
         Assignee: Bruce Robbins
    Fix Version/s: 2.4.0

> InterpretedPredicate.eval fails if expression tree contains Nondeterministic expressions
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-24043
>                 URL: https://issues.apache.org/jira/browse/SPARK-24043
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Bruce Robbins
>            Assignee: Bruce Robbins
>            Priority: Minor
>             Fix For: 2.4.0
>
>
> When whole-stage codegen and predicate codegen both fail, FilterExec falls back to
> using InterpretedPredicate. If the predicate's expression contains any non-deterministic expressions,
> the evaluation throws an error:
> {noformat}
> scala> val df = Seq((1)).toDF("a")
> df: org.apache.spark.sql.DataFrame = [a: int]
> scala> df.filter('a > 0).show // this works fine
> 2018-04-21 20:39:26 WARN  FilterExec:66 - Codegen disabled for this expression:
>  (value#1 > 0)
> +---+
> |  a|
> +---+
> |  1|
> +---+
> scala> df.filter('a > rand(7)).show // this will throw an error
> 2018-04-21 20:39:40 WARN  FilterExec:66 - Codegen disabled for this expression:
>  (cast(value#1 as double) > rand(7))
> 2018-04-21 20:39:40 ERROR Executor:91 - Exception in task 0.0 in stage 1.0 (TID 1)
> java.lang.IllegalArgumentException: requirement failed: Nondeterministic expression org.apache.spark.sql.catalyst.expressions.Rand should be initialized before eval.
> 	at scala.Predef$.require(Predef.scala:224)
> 	at org.apache.spark.sql.catalyst.expressions.Nondeterministic$class.eval(Expression.scala:326)
> 	at org.apache.spark.sql.catalyst.expressions.RDG.eval(randomExpressions.scala:34)
> {noformat}
> This is because no code initializes the Nondeterministic expressions before eval is called
> on them.
> This is a low-impact issue, since both whole-stage codegen and predicate codegen must
> fail before FilterExec falls back to using InterpretedPredicate.
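
The failure mode described above can be modeled in a few lines of standalone Scala. This is a simplified sketch, not Spark's source: `Expr`, `NondeterministicExpr`, `Column`, `RandLike`, and `PredicateSketch` are hypothetical stand-ins for Catalyst's classes, and the only behavior assumed is the contract visible in the stack trace (eval requires a prior initialize). It illustrates one plausible remedy: walk the expression tree and initialize every nondeterministic node before the first eval.

```scala
object NondeterministicSketch {
  // Hypothetical, simplified expression node; not Catalyst's Expression.
  trait Expr {
    def children: Seq[Expr]
    def eval(input: Int): Double
    // Visit this node and all of its descendants.
    def foreach(f: Expr => Unit): Unit = {
      f(this)
      children.foreach(_.foreach(f))
    }
  }

  // Models the contract from the stack trace: eval fails unless initialize ran first.
  trait NondeterministicExpr extends Expr {
    private var initialized = false
    protected def initializeInternal(partitionIndex: Int): Unit
    protected def evalInternal(input: Int): Double
    final def initialize(partitionIndex: Int): Unit = {
      initializeInternal(partitionIndex)
      initialized = true
    }
    final def eval(input: Int): Double = {
      require(initialized,
        s"Nondeterministic expression ${getClass.getName} should be initialized before eval.")
      evalInternal(input)
    }
  }

  // Deterministic leaf: returns the input row value.
  case object Column extends Expr {
    val children: Seq[Expr] = Nil
    def eval(input: Int): Double = input.toDouble
  }

  // Nondeterministic leaf: a seeded random value in [0, 1).
  final case class RandLike(seed: Long) extends NondeterministicExpr {
    private var rng: scala.util.Random = _
    val children: Seq[Expr] = Nil
    protected def initializeInternal(partitionIndex: Int): Unit =
      rng = new scala.util.Random(seed + partitionIndex)
    protected def evalInternal(input: Int): Double = rng.nextDouble()
  }

  // Interpreted "left > right" predicate (hypothetical stand-in).
  final class PredicateSketch(left: Expr, right: Expr) {
    // Initialize every nondeterministic node in both subtrees.
    def initialize(partitionIndex: Int): Unit =
      for (e <- Seq(left, right)) e.foreach {
        case n: NondeterministicExpr => n.initialize(partitionIndex)
        case _ => // deterministic nodes need no setup
      }
    def eval(input: Int): Boolean = left.eval(input) > right.eval(input)
  }
}
```

Calling `eval` on a `PredicateSketch` that contains `RandLike` without first calling `initialize` raises an `IllegalArgumentException` from `require`, mirroring the reported error; calling `initialize(partitionIndex)` once per partition beforehand avoids it.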


