spark-user mailing list archives

From Matei Zaharia <>
Subject Re: java options for spark-1.0.0
Date Wed, 02 Jul 2014 18:33:57 GMT
Try looking at the running processes with “ps” to see their full command line and see whether
any options are different. It seems like in both cases, your young generation is quite large
(11 GB), which doesn’t make a lot of sense with a heap of 15 GB. But maybe I’m misreading.
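
A minimal sketch of the "ps" check suggested above, not a definitive recipe. The function names jvm_flags and show_java_flags are made up for illustration, and the grep pattern only covers a few common heap and GC flags. Options that the JVM picks up from the _JAVA_OPTIONS environment variable may not appear on the command line at all, so treat missing flags with some caution.

```shell
# Keep only heap- and GC-related JVM flags from command lines read on stdin.
# Prints one flag per line. (Hypothetical helper, not part of Spark.)
jvm_flags() {
  tr -s ' ' '\n' | grep -E '^-(Xmx|Xms|Xmn|XX:)'
}

# Show those flags for every running process whose command line mentions java.
# Writing the pattern as [j]ava keeps grep from matching its own command line.
show_java_flags() {
  ps -eo pid,args | grep '[j]ava' | while read -r pid args; do
    echo "== pid $pid"
    echo "$args" | jvm_flags
  done
}
```

Running show_java_flags once under spark-0.8.0 and once under spark-1.0.0 and comparing the two listings should show whether the heap options actually reach the JVM in both cases.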


On Jul 2, 2014, at 4:50 AM, Wanda Hawk <> wrote:

> I ran SparkKMeans on a big file (~7 GB of data) for one iteration with spark-0.8.0,
with this line in .bashrc: export _JAVA_OPTIONS="-Xmx15g -Xms15g -verbose:gc -XX:+PrintGCTimeStamps
-XX:+PrintGCDetails". It finished in a decent time, ~50 seconds, and I saw only a few "Full
GC...." messages from Java (4-5 at most).
> Now, using the same export in .bashrc but with spark-1.0.0 (and running it with spark-submit),
the first loop never finishes and I get a lot of:
> "18.537: [GC (Allocation Failure) --[PSYoungGen: 11796992K->11796992K(13762560K)]
11797442K->11797450K(13763072K), 2.8420311 secs] [Times: user=5.81 sys=2.12, real=2.85
> "
> or 
>  "31.867: [Full GC (Ergonomics) [PSYoungGen: 11796992K->3177967K(13762560K)] [ParOldGen:
505K->505K(512K)] 11797497K->3178473K(13763072K), [Metaspace: 37646K->37646K(1081344K)],
2.3053283 secs] [Times: user=37.74 sys=0.11, real=2.31 secs]"
> I tried passing different parameters for the JVM through spark-submit, but the results
are the same.
> This happens with Java 1.7 and also with Java 1.8.
> I do not know what "Ergonomics" stands for ...
> How can I get decent performance from spark-1.0.0, considering that spark-0.8.0 did
not need any fine-tuning of the garbage collection method (the default worked well)?
> Thank you
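
To make the sizes in the GC lines quoted above easier to compare with the 15 GB heap, here is a small sketch that converts the K figures to GiB. The helper name k_to_gib is made up for illustration, and it assumes the figures are in units of 1024 bytes, as the K suffix in these logs suggests.

```shell
# Convert a size such as "11796992K" (as printed in the GC log) to GiB,
# rounded to one decimal place. (Hypothetical helper for illustration.)
k_to_gib() {
  awk -v s="$1" 'BEGIN { sub(/K$/, "", s); printf "%.1f\n", s / 1048576 }'
}

k_to_gib 11796992K   # PSYoungGen occupancy from the quoted log
k_to_gib 13762560K   # PSYoungGen capacity from the quoted log
```

By comparison, the ParOldGen figure in the same log is only 512K, so nearly the entire heap is assigned to the young generation.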
