spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Madhu <ma...@madhu.com>
Subject Re: Detecting configuration problems
Date Tue, 08 Sep 2015 12:53:57 GMT
Thanks Akhil!

I suspect the root cause of the shuffle OOM I was seeing (and probably many
that users might see) is due to individual partitions on the reduce side not
fitting in memory. As a guideline, I was thinking of something like "be sure
that your largest partitions occupy no more then 1% of executor memory" or
something to that effect. I can add that documentation to the tuning page if
someone can suggest the the best wording and numbers. I can also add a
simple Spark shell example to estimate largest partition size to determine
executor memory and number of partitions.

One more question: I'm trying to get my head around the shuffle code. I see
ShuffleManager, but that seems to be on the reduce side. Where is the code
driving the map side writes and reduce reads? I think it is possible to add
up reduce side volume for a key (they are byte reads at some point) and
raise an alarm if it's getting too high. Even a warning on the console would
be better than a catastrophic OOM.



-----
--
Madhu
https://www.linkedin.com/in/msiddalingaiah
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Detecting-configuration-problems-tp13980p13998.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Mime
View raw message