spark-dev mailing list archives

From jerryye <>
Subject spark.akka.frameSize stalls job in 1.1.0
Date Fri, 15 Aug 2014 15:18:34 GMT
Hi All,
I'm not sure if I should file a JIRA or if I'm missing something obvious,
since the test code I'm trying is so simple. I've isolated the problem I'm
seeing to a memory issue, but I don't know which parameter I need to tweak; it
does seem related to spark.akka.frameSize. If I sample my RDD with 35% of
the data, everything runs to completion; with more than 35%, it fails. In
standalone mode, I can run on the full RDD without any problems.

// works
val samples = sc.textFile("s3n://geonames").sample(false, 0.35) // 64MB, 2849439 lines

// fails
val samples = sc.textFile("s3n://geonames").sample(false, 0.4) // 64MB, 2849439 lines

Any ideas? 

1) RDD size is causing the problem. The code below fails as is, but if I swap
in smallSample for samples, it runs end to end in both cluster and standalone
modes.
2) The error I get is: 
org.apache.spark.SparkException: Job aborted due to stage failure: Task 3.0:1
failed 4 times, most recent failure: TID 12 on host failed for unknown reason 
Driver stacktrace: 

3) Using the 1.1.0 branch, the driver freezes instead of aborting with the
error described in #2.
4) In 1.1.0, changing spark.akka.frameSize likewise leaves the driver making
no progress.

val smallSample = sc.parallelize(Array("foo word", "bar word", "baz word")) 

val samples = sc.textFile("s3n://geonames") // 64MB, 2849439 Lines of short

val counts = new collection.mutable.HashMap[String, Int].withDefaultValue(0) 

samples.collect().foreach(l => counts(l) += 1) // count occurrences of each line on the driver

// the closure below captures the entire counts map
val result = samples.map(l => (l, counts.get(l)))
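For what it's worth, my suspicion is that the map closure drags the whole
counts map into every task, which would explain the frameSize sensitivity. If
that's right, a broadcast variable should sidestep it. A minimal sketch,
assuming the same sc, samples, and counts as above (I haven't verified this
against my cluster, just illustrating the idea with the standard sc.broadcast
API):

```scala
// Sketch: ship the large counts map to executors via a broadcast variable
// instead of capturing it in each task's closure.
val countsLocal = counts.toMap            // immutable snapshot of the mutable map
val countsBc = sc.broadcast(countsLocal)  // broadcast once, not per task

// tasks read the map through the broadcast handle
val result = samples.map(l => (l, countsBc.value.get(l)))
```

Broadcast values are distributed to executors outside the task-serialization
path, so the per-task payload stays small.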


Settings (with or without Kryo doesn't matter): 
export SPARK_JAVA_OPTS="-Xms5g -Xmx10g -XX:MaxPermSize=10g" 
export SPARK_MEM=10g 
spark.akka.frameSize 40 
#spark.serializer org.apache.spark.serializer.KryoSerializer 
#spark.kryoserializer.buffer.mb 1000 
spark.executor.memory 58315m 
spark.executor.extraLibraryPath /root/ephemeral-hdfs/lib/native/ 
spark.executor.extraClassPath /root/ephemeral-hdfs/conf
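For reference, the lines above are in conf/spark-defaults.conf; frameSize can
also be overridden per job at submit time. The value is in MB, and 100 below
is just an illustrative number (not something I've verified fixes this), with
my-job.jar as a placeholder:

```shell
# spark.akka.frameSize is specified in MB (default 10 in the 1.x line).
# Equivalent to the spark-defaults.conf entry, but scoped to one submission:
spark-submit --conf spark.akka.frameSize=100 my-job.jar
```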
