spark-issues mailing list archives

From "Fernando Pereira (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-21172) EOFException reached end of stream in UnsafeRowSerializer
Date Mon, 15 Jan 2018 11:28:02 GMT

    [ https://issues.apache.org/jira/browse/SPARK-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326122#comment-16326122 ]

Fernando Pereira edited comment on SPARK-21172 at 1/15/18 11:27 AM:
--------------------------------------------------------------------

With my previous, smaller dataset I was able to make the job run by changing the number of
partitions (spark.sql.shuffle.partitions) to a more standard value.
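
For reference, a minimal sketch (assuming a standard SparkSession; the value 200 is just Spark's default, not a specific recommendation for this job) of how that setting can be applied:

{code}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("shuffle-partitions-workaround")       // illustrative app name
  .config("spark.sql.shuffle.partitions", "200")  // 200 is the Spark default
  .getOrCreate()

// The value can also be changed on an already-running session:
spark.conf.set("spark.sql.shuffle.partitions", "200")
{code}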

However, now with a 200 GB dataset there is no setting that makes it work. I always hit the
problem sooner or later, sometimes only after more than 1000 partitions have been processed.

Given that tuning config values can only mitigate the problem, I really believe there is a bug
in the shuffle read path that doesn't handle all corner cases.

Another symptom is this error message showing up on other workers:

{code}
java.lang.IndexOutOfBoundsException: len is negative
	at org.spark_project.guava.io.ByteStreams.read(ByteStreams.java:895)
	at org.spark_project.guava.io.ByteStreams.readFully(ByteStreams.java:733)
	at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$2$$anon$3.next(UnsafeRowSerializer.scala:127)
	at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$2$$anon$3.next(UnsafeRowSerializer.scala:110)
	at scala.collection.Iterator$$anon$12.next(Iterator.scala:444)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:40)
{code}
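
Both stack traces point at a length-prefixed read: a row size is read first, then exactly that many bytes are expected to follow. The sketch below is an assumed simplification (not the actual UnsafeRowSerializer source; the reusable buffer is hypothetical), just to show how a corrupted size field or a truncated stream would map onto the two errors above:

{code}
import java.io.DataInputStream

// Hypothetical sketch of a length-prefixed row read from a shuffle stream.
// `buffer` is assumed to be at least as large as any valid row.
def readRow(in: DataInputStream, buffer: Array[Byte]): Array[Byte] = {
  // A corrupted or misaligned stream can yield a negative size here.
  val rowSize = in.readInt()
  // Negative rowSize -> IndexOutOfBoundsException ("len is negative");
  // truncated stream -> EOFException ("reached end of stream after reading N bytes").
  in.readFully(buffer, 0, rowSize)
  java.util.Arrays.copyOf(buffer, rowSize)
}
{code}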



> EOFException reached end of stream in UnsafeRowSerializer
> ---------------------------------------------------------
>
>                 Key: SPARK-21172
>                 URL: https://issues.apache.org/jira/browse/SPARK-21172
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 2.0.1
>            Reporter: liupengcheng
>            Priority: Major
>              Labels: shuffle
>
> Spark SQL job failed because of the following exception. Seems like a bug in the shuffle stage.
> Shuffle read size for a single task is tens of GB.
> {code}
> org.apache.spark.SparkException: Task failed while writing rows
> 	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:264)
> 	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
> 	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
> 	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:86)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.EOFException: reached end of stream after reading 9034374 bytes; 1684891936 bytes expected
> 	at org.spark_project.guava.io.ByteStreams.readFully(ByteStreams.java:735)
> 	at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:127)
> 	at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:110)
> 	at scala.collection.Iterator$$anon$12.next(Iterator.scala:444)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> 	at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
> 	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> 	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply$mcV$sp(WriterContainer.scala:255)
> 	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:253)
> 	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:253)
> 	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1345)
> 	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:259)
> 	... 8 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

