spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: Detecting configuration problems
Date Tue, 08 Sep 2015 10:08:46 GMT
I found an old JIRA referring the same.
https://issues.apache.org/jira/browse/SPARK-5421

Thanks
Best Regards

On Sun, Sep 6, 2015 at 8:53 PM, Madhu <madhu@madhu.com> wrote:

> I'm not sure if this has been discussed already, if so, please point me to
> the thread and/or related JIRA.
>
> I have been running with about 1TB volume on a 20 node D2 cluster (255
> GiB/node).
> I have uniformly distributed data, so skew is not a problem.
>
> I found that default settings (or wrong setting) for driver and executor
> memory caused out of memory exceptions during shuffle (subtractByKey to be
> exact). This was not easy to track down, for me at least.
>
> Once I bumped up driver to 12G and executor to 10G with 300 executors and
> 3000 partitions, shuffle worked quite well (12 mins for subtractByKey). I'm
> sure there are more improvement to made, but it's a lot better than heap
> space exceptions!
>
> From my reading, the shuffle OOM problem is in ExternalAppendOnlyMap or
> similar disk backed collection.
> I have some familiarity with that code based on previous work with external
> sorting.
>
> Is it possible to detect misconfiguration that leads to these OOMs and
> produce a more meaningful error messages? I think that would really help
> users who might not understand all the inner workings and configuration of
> Spark (myself included). As it is, heap space issues are a challenge and
> does not present Spark in a positive light.
>
> I can help with that effort if someone is willing to point me to the
> precise
> location of memory pressure during shuffle.
>
> Thanks!
>
>
>
> -----
> --
> Madhu
> https://www.linkedin.com/in/msiddalingaiah
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Detecting-configuration-problems-tp13980.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>

Mime
View raw message