spark-dev mailing list archives

From jerryye <>
Subject saveAsTextFile makes no progress without caching RDD
Date Fri, 22 Aug 2014 00:13:50 GMT
Cross-posting this from the users list.

I'm running on branch-1.1 and trying to apply a simple transformation to a
relatively small 64GB dataset. With the following code, saveAsTextFile
essentially hangs and tasks stay stuck in the running state:

// Stalls with tasks running for over an hour with no tasks finishing.
// Smallest partition is 10MB.
val data = sc.textFile("s3n://input") 
val reformatted = =>

// This runs but stalls doing GC after filling up 150% of 650GB of memory 
val data = sc.textFile("s3n://input") 
val reformatted = =>

Any idea whether this is a parameter issue, and whether there is something I
should try?


- jerry 
