spark-issues mailing list archives

From "Ziv Huang (JIRA)" <>
Subject [jira] [Commented] (SPARK-3687) Spark hang while processing more than 100 sequence files
Date Thu, 25 Sep 2014 07:28:33 GMT


Ziv Huang commented on SPARK-3687:

Just a few minutes ago I ran the job twice, processing 203 sequence files.
Both times the job hung, with behavior different from before:
1. the Spark master web UI shows the job finished with state "failed" after 3.x
2. the job stage web UI still hangs, and the execution duration keeps accumulating.
Hope this information helps debugging :)

> Spark hang while processing more than 100 sequence files
> --------------------------------------------------------
>                 Key: SPARK-3687
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.2, 1.1.0
>            Reporter: Ziv Huang
> In my application, I read more than 100 sequence files into a JavaPairRDD, perform a flatMap to get another JavaRDD, and then use takeOrdered to get the result.
> Quite often (but not always), Spark hangs while executing some of the 110th-130th
> The job can hang for several hours, maybe forever (I can't wait for its completion).
> When the Spark job hangs, I can't find any error message anywhere, and I can't kill the job from the web UI.
> The current workaround is to use coalesce to reduce the number of partitions to be processed.
> The job never hangs if the number of partitions to be processed is no greater than

This message was sent by Atlassian JIRA

