spark-issues mailing list archives

From "Haohai Ma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-4906) Spark master OOMs with exception stack trace stored in JobProgressListener
Date Thu, 31 Mar 2016 20:25:25 GMT

    [ https://issues.apache.org/jira/browse/SPARK-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220609#comment-15220609 ]

Haohai Ma commented on SPARK-4906:
----------------------------------

We recently hit a similar issue: our Spark Master ran out of memory (OOM). A detailed
retained-memory report is attached.

> Spark master OOMs with exception stack trace stored in JobProgressListener
> --------------------------------------------------------------------------
>
>                 Key: SPARK-4906
>                 URL: https://issues.apache.org/jira/browse/SPARK-4906
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 1.1.1
>            Reporter: Mingyu Kim
>         Attachments: LeakingJobProgressListener2OOM.docx
>
>
> Spark master was OOMing with a lot of stack traces retained in JobProgressListener. The
> object dependency goes like the following.
> JobProgressListener.stageIdToData => StageUIData.taskData => TaskUIData.errorMessage
> Each error message is ~10 KB since it contains the entire stack trace. As we have a lot of
> tasks, when all of the tasks across multiple stages go bad, these error messages accounted
> for 0.5 GB of heap at some point.
> Please correct me if I'm wrong, but it looks like all the task info for running applications
> is kept in memory, which means it's almost always bound to OOM for long-running applications.
> Would it make sense to fix this, for example, by spilling some UI state to disk?
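
To make the retention chain in the quoted description concrete, here is a minimal Python sketch of one possible mitigation: cap the length of each stored error message and evict the oldest stages once a limit is reached. It is not Spark's implementation. `BoundedStageData`, `truncate`, `MAX_ERROR_CHARS`, and `MAX_RETAINED_STAGES` are hypothetical names and values chosen for illustration, and the arithmetic at the end assumes "0.5 GB" means 512 MiB and "~10 KB" means 10 KiB.

```python
from collections import OrderedDict

MAX_ERROR_CHARS = 1024    # hypothetical cap on one stored error message
MAX_RETAINED_STAGES = 3   # hypothetical cap on stages kept in memory


def truncate(message, limit=MAX_ERROR_CHARS):
    """Keep the head of a long stack trace and note how much was dropped."""
    if len(message) <= limit:
        return message
    dropped = len(message) - limit
    return message[:limit] + "... [%d chars truncated]" % dropped


class BoundedStageData:
    """Maps stage id -> {task id -> error message}, mirroring the
    stageIdToData -> taskData -> errorMessage chain, but bounded."""

    def __init__(self, max_stages=MAX_RETAINED_STAGES):
        self.max_stages = max_stages
        self.stages = OrderedDict()  # insertion order = stage arrival order

    def record_failure(self, stage_id, task_id, error_message):
        tasks = self.stages.setdefault(stage_id, {})
        tasks[task_id] = truncate(error_message)
        # Evict the oldest stages once the cap is exceeded.
        while len(self.stages) > self.max_stages:
            self.stages.popitem(last=False)

    def retained_chars(self):
        """Total characters of error text currently held."""
        return sum(len(msg)
                   for tasks in self.stages.values()
                   for msg in tasks.values())


# Rough arithmetic from the report: how many ~10 KiB messages fit in 512 MiB?
approx_messages = (512 * 1024 * 1024) // (10 * 1024)
```

Under those assumptions, `approx_messages` comes out at a little over 52,000 failed-task messages, which is easy to reach when every task in several large stages fails. Truncation bounds the cost per task and eviction bounds the number of stages, but neither bounds the number of tasks within a retained stage; spilling to disk, as the reporter suggests, would be a separate design.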



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

