spark-issues mailing list archives

From "Barry Becker (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-19317) UnsupportedOperationException: empty.reduceLeft in LinearSeqOptimized
Date Fri, 20 Jan 2017 23:45:26 GMT
Barry Becker created SPARK-19317:
------------------------------------

             Summary: UnsupportedOperationException: empty.reduceLeft in LinearSeqOptimized
                 Key: SPARK-19317
                 URL: https://issues.apache.org/jira/browse/SPARK-19317
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.1.0
            Reporter: Barry Becker


I wish I had a simpler reproducible case to give, but I got the exception below while
selecting the null values in one of the columns of a DataFrame.
The client code that failed was
df.filter(filterExp).count()
where the filter expression was something like someColumn.isNull.
There were 412 nulls out of 716,000 total rows in the column being filtered.
Oddly, I have a different, smaller dataset where I did the same thing on a column
with 100 nulls out of 800 rows and did not get the error.
The exception seems to indicate that Spark is trying to call reduceLeft on an empty list.
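
Since the trace goes through InMemoryTableScanExec, the DataFrame was presumably cached
before the count. Here is a minimal sketch of the failing call pattern; the column name
and the toy data are hypothetical stand-ins, not the dataset that actually triggered the error:

{code}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().appName("SPARK-19317-sketch").getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the real ~716,000-row dataset (412 nulls in the
// filtered column); Option values produce a nullable column.
val df = Seq(Some(1), None, Some(3)).toDF("someColumn").cache()

// The call pattern that failed: count the rows where the column is null.
val filterExp = col("someColumn").isNull
df.filter(filterExp).count()
{code}

The full stack trace from the failed count: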

{code}
java.lang.UnsupportedOperationException: empty.reduceLeft
scala.collection.LinearSeqOptimized$class.reduceLeft(LinearSeqOptimized.scala:137)
scala.collection.immutable.List.reduceLeft(List.scala:84)
scala.collection.TraversableOnce$class.reduce(TraversableOnce.scala:208)
scala.collection.AbstractTraversable.reduce(Traversable.scala:104)
org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$1.applyOrElse(InMemoryTableScanExec.scala:90)
org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$1.applyOrElse(InMemoryTableScanExec.scala:54)
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$1.applyOrElse(InMemoryTableScanExec.scala:61)
org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$1.applyOrElse(InMemoryTableScanExec.scala:54)
scala.PartialFunction$Lifted.apply(PartialFunction.scala:223)
scala.PartialFunction$Lifted.apply(PartialFunction.scala:219)
org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$2.apply(InMemoryTableScanExec.scala:95)
org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$2.apply(InMemoryTableScanExec.scala:94)
scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
scala.collection.immutable.List.foreach(List.scala:381)
scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
scala.collection.immutable.List.flatMap(List.scala:344)
org.apache.spark.sql.execution.columnar.InMemoryTableScanExec.<init>(InMemoryTableScanExec.scala:94)
org.apache.spark.sql.execution.SparkStrategies$InMemoryScans$$anonfun$6.apply(SparkStrategies.scala:306)
org.apache.spark.sql.execution.SparkStrategies$InMemoryScans$$anonfun$6.apply(SparkStrategies.scala:306)
org.apache.spark.sql.execution.SparkPlanner.pruneFilterProject(SparkPlanner.scala:96)
org.apache.spark.sql.execution.SparkStrategies$InMemoryScans$.apply(SparkStrategies.scala:302)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:62)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:62)
scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:92)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:77)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:74)
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
scala.collection.Iterator$class.foreach(Iterator.scala:893)
scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:74)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:66)
scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:92)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:77)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:74)
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
scala.collection.Iterator$class.foreach(Iterator.scala:893)
scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:74)
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:66)
scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:92)
org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:79)
org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:75)
org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:84)
org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:84)
org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2774)
org.apache.spark.sql.Dataset.count(Dataset.scala:2404)
mypackage.Selection(Selection.scala:34)

{code}
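
For reference, the exception type and the first two frames match what plain Scala (the 2.11
series that Spark 2.1.0 builds against) produces when reduceLeft is applied to an empty list,
which supports the reading above:

{code}
scala> List.empty[Int].reduceLeft(_ + _)
java.lang.UnsupportedOperationException: empty.reduceLeft
  at scala.collection.LinearSeqOptimized$class.reduceLeft(LinearSeqOptimized.scala:137)
  at scala.collection.immutable.List.reduceLeft(List.scala:84)
{code}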



