spark-issues mailing list archives

From "Joe Chong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-16676) Spark jobs stay in pending
Date Tue, 26 Jul 2016 17:04:20 GMT

    [ https://issues.apache.org/jira/browse/SPARK-16676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394094#comment-15394094 ]

Joe Chong commented on SPARK-16676:
-----------------------------------

Unfortunately, those are all the logs available in the console, as pasted above. Any idea
where else to look?

> Spark jobs stay in pending
> --------------------------
>
>                 Key: SPARK-16676
>                 URL: https://issues.apache.org/jira/browse/SPARK-16676
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, Spark Shell
>    Affects Versions: 1.5.2
>         Environment: Mac OS X Yosemite, Terminal, Spark-shell standalone
>            Reporter: Joe Chong
>         Attachments: Spark UI stays @ pending.png
>
>
> I've been having issues executing certain Scala statements within the Spark shell. These
statements come from a tutorial/blog post written by Carol McDonald at MapR.
> The import statements and reading text files into DataFrames work fine. However, when I try
to run df.show(), execution stalls. Checking the job in the Spark UI, I see that the
stage is active, but one of its dependent jobs stays in Pending without any movement. The
logs are below.
> scala> fltCountsql.show()
> 16/07/22 11:40:16 INFO spark.SparkContext: Starting job: show at <console>:46
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Registering RDD 31 (show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Got job 4 (show at <console>:46) with 200 output partitions
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Final stage: ResultStage 8(show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 7)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 7)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 7 (MapPartitionsRDD[31] at show at <console>:46), which has no missing parents
> 16/07/22 11:40:16 INFO storage.MemoryStore: ensureFreeSpace(18128) called with curMem=115755879, maxMem=2778495713
> 16/07/22 11:40:16 INFO storage.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 17.7 KB, free 2.5 GB)
> 16/07/22 11:40:16 INFO storage.MemoryStore: ensureFreeSpace(7527) called with curMem=115774007, maxMem=2778495713
> 16/07/22 11:40:16 INFO storage.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 7.4 KB, free 2.5 GB)
> 16/07/22 11:40:16 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on localhost:61408 (size: 7.4 KB, free: 2.5 GB)
> 16/07/22 11:40:16 INFO spark.SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:861
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 7 (MapPartitionsRDD[31] at show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.TaskSchedulerImpl: Adding task set 7.0 with 2 tasks
> 16/07/22 11:40:16 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 7.0 (TID 4, localhost, PROCESS_LOCAL, 2156 bytes)
> 16/07/22 11:40:16 INFO executor.Executor: Running task 0.0 in stage 7.0 (TID 4)
> 16/07/22 11:40:16 INFO storage.BlockManager: Found block rdd_2_0 locally
> 16/07/22 11:40:17 INFO executor.Executor: Finished task 0.0 in stage 7.0 (TID 4). 2738 bytes result sent to driver
> 16/07/22 11:40:17 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 7.0 (TID 4) in 920 ms on localhost (1/2)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

